Classical and Modern Direction-of-Arrival Estimation




Engin Tuncer and
Benjamin Friedlander 



Academic Press is an imprint of Elsevier 
30 Corporate Drive, Suite 400 
Burlington, MA 01803, USA 

This book is printed on acid-free paper.

Copyright © 2009 by Elsevier Inc. All rights reserved. 

Designations used by companies to distinguish their products are often claimed as trademarks 
or registered trademarks. In all instances in which Academic Press is aware of a claim, the 
product names appear in initial capital or all capital letters. Readers, however, should contact 
the appropriate companies for more complete information regarding trademarks and 
registration. 

No part of this publication may be reproduced, stored in a retrieval system, or transmitted 
in any form or by any means, electronic, mechanical, photocopying, scanning, or otherwise, 
without prior written permission of the publisher. 

Permissions may be sought directly from Elsevier’s Science & Technology Rights 
Department in Oxford, UK: phone: (+44) 1865 843830, fax: (+44) 1865 853333, 
e-mail: permissions@elsevier.com. You may also complete your request on-line via the 
Elsevier homepage ( http://elsevier.com ), by selecting “Support & Contact” then 
“Copyright and Permission” and then “Obtaining Permissions.” 

Library of Congress Cataloging-in-Publication Data 

Application submitted. 

ISBN-13: 978-0-12-374524-8 


For information on all Academic Press publications 
visit our Web site at www.elsevierdirect.com 


Typeset by: diacriTech, India 

Printed in the United States of America 
09 10 11 12 13 54321 









Preface 


The need for direction-of-arrival (DOA) estimation arises in many engineering
applications including wireless communications, radar, radio astronomy,
sonar, navigation, tracking of various objects, and rescue and other emergency
assistance devices. In its modern version, DOA estimation is usually studied
as part of the more general field of array processing. Much of the work in 
this field, especially in earlier days, focused on radio direction finding—that 
is, estimating the direction of electromagnetic waves impinging on one or more 
antennas. 

The problem of acoustic direction estimation was also studied extensively, 
mostly in the context of sonar. In fact much of the development of what is now 
called “modern DOA estimation” was done for sonar applications where the 
relatively small bandwidth of the signals to be processed made the computational 
requirements of advanced algorithms feasible with the technology that existed 
then. As processing power kept increasing, it became possible to apply advanced 
techniques to the more demanding wider bandwidth communications and radar 
signals. 

While DOA estimation is now a mature field with a solid theoretical basis and 
a large number of practical applications, it is still an evolving and quite active 
field of research. This book attempts to provide a snapshot of the most recent 
work on this ubiquitous problem and, at the same time, to provide a brief review 
of the more classical work on direction finding. 

The book contains ten chapters. Chapter 1 by Friedlander lays out the
fundamentals of the DOA problem. Starting with a discussion of how it all originated,
it blends both classical and modern techniques. Chapter 2 by Demmel is a good
reference for practicing engineers. It presents the techniques currently used in
commercial direction-finding (DF) systems. Chapter 3 by Viberg, Lanne, and
Lundgren presents a critical topic for sensor arrays, namely calibration.
Different techniques for calibration are presented and numerically compared in this
chapter. Chapter 4 by Tuncer, Yasar, and Friedlander discusses narrowband and
wideband processing for DOA estimation. The advantages of array interpolation
and processing gain achieved by wideband processing are outlined. Chapter 5
by Rübsamen and Gershman presents techniques for search-free DOA
estimation for different arrays. Such techniques allow one to use fast algorithms for
some unconventional array structures.

Chapter 6 by Amin and Zhang introduces a new dimension to the DOA
problem, namely spatial time-frequency distributions. Direction-of-arrival
estimation can be improved as a result of signal-to-noise ratio (SNR)
enhancement and source discrimination in the time-frequency domain. Chapter 7
by Abramovich, Johnson, and Mestre discusses an interesting problem in the
threshold region. The expected likelihood approach is used as a mechanism to
assess the quality of estimates for the low sample case. Chapter 8 by Chevalier,
Ferreol, and Albera presents the advantages of higher-order statistics compared
to second-order statistics for direction of arrival. It has been shown that a virtual
increase of the array aperture by the introduction of virtual sensors can be used
to improve resolution and reduce sensitivity to modeling errors. Chapter 9 by
Chen and Yao discusses the localization problem in sensor networks. Both
maximum likelihood formulation and performance bounds are presented for source
localization. Chapter 10 by Amar and Weiss advocates direct position
determination for source localization. It has been shown that superior results
can be obtained at low SNR even when there are modeling errors.


Wireless Direction-Finding Fundamentals


Benjamin Friedlander 


1.1 INTRODUCTION 


Wireless direction finding has a long and distinguished history going back to
the very beginnings of wireless communications around the turn of the twentieth
century. Many of its basic concepts had been established by that time. To put
this in perspective we quote the opening sentence of the Foreword (by T. L.
Eckersley, F.R.S.) to a book first published in 1922 entitled Wireless Direction
Finding (Keen, 1938):


Everything that the technician requires to know about the now well-established art 
of Radio Direction and Position Finding will be found most lucidly explained in the 
following chapters. 

The first attempts at direction finding (DF) made use of the directional
characteristics of antenna elements (dipoles, loops, etc.) (Bellini and Tosi, 1907;
Marconi, 1909; 1906). However, the use of multiple antennas and phased antenna
arrays did not lag far behind (Adcock, 1919; Keen, 1938).


Many advances have been made in the field of direction finding since the 
early 1900s, for the most part due to advances in technology rather than new 
concepts or principles (Travers and Hixon, 1966). The remarkable developments
in electronics and in devices and components for generating and amplifying 
signals at higher and higher frequencies have greatly extended the capabilities 
and applicability of direction-finding systems. The shift from analog to digital 
technology has increased tremendously the flexibility of such systems and their 
ability to monitor and keep track of signals over a wide range of frequencies and 
large geographical areas. However, the basic direction-finding techniques have 
not changed much, with the exception of the introduction of “super-resolution” 
algorithms for multiple co-channel signals.


In this chapter we describe some of the basic direction-finding methods for 
antenna arrays and discuss their performance characteristics in the presence of 
noise, multipath, and interference. We then provide a brief introduction to 
co-channel multi-emitter direction finding and super-resolution, particularly the 
merits and drawbacks of super-resolution methods and their relationship to 
classical methods. 

1.2 PROBLEM FORMULATION 

Consider an emitter transmitting a signal $s(t)e^{j\omega_c t}$, where $s(t)$ is the baseband
signal, and $\omega_c = 2\pi f_c$, where $f_c$ is the carrier frequency. This signal is received by
an array of antennas as depicted in Figure 1.1. The received signals are delayed
versions of the transmitted signal

$$\mathbf{x}_{pb}(t) = \begin{bmatrix} s(t-\tau_1)\,e^{j\omega_c(t-\tau_1)} \\ s(t-\tau_2)\,e^{j\omega_c(t-\tau_2)} \\ \vdots \\ s(t-\tau_M)\,e^{j\omega_c(t-\tau_M)} \end{bmatrix} \tag{1.1}$$


where $\mathbf{x}_{pb}(t)$ is the vector of the received passband signals and $\tau_m$ are the
propagation delays determined by the direction of the source relative to the
array. For example, in the case of a linear array

$$\tau_m = \tau_0 - (d_m/c)\sin\theta \tag{1.2}$$

where $\tau_0$ is the propagation delay from the emitter to a reference point on the
array, $d_m$ are the distances of the array elements from that reference point, $c$ is
the speed of light, and $\theta$ is the emitter direction measured relative to the line
perpendicular to the array. Without loss of generality we can assume that $\tau_0 = 0$.
(This is equivalent to defining $s(t)$ as the baseband signal at the reference point
of the receiver array, not as the source signal.) Thus, in the rest of this chapter
we will assume that

$$\tau_m = -(d_m/c)\sin\theta \tag{1.3}$$

We note that the contribution of $\tau_0$ to the phase, $e^{-j\omega_c\tau_0}$, introduces an unknown
random phase term that is common to the elements of the received signal vector.
Because of this phase term, all direction-finding methods must be designed to
be invariant to a common additive phase.

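After transforming the passband signals to baseband, we have

$$\mathbf{x}(t) = \begin{bmatrix} s(t-\tau_1)\,e^{-j\omega_c\tau_1} \\ s(t-\tau_2)\,e^{-j\omega_c\tau_2} \\ \vdots \\ s(t-\tau_M)\,e^{-j\omega_c\tau_M} \end{bmatrix} \tag{1.4}$$

where $\mathbf{x}(t)$ is the vector of received baseband signals.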

Let $D$ denote the aperture of the array in wavelengths $\lambda$. Then $D\lambda/c = D/f_c$
is the time it takes the signal to propagate over the array, so that $\tau_m < D/f_c$. It
follows that, if the signal bandwidth is $B$ and $B \ll f_c/D$, $s(t-\tau_m) \approx s(t)$. The
condition $B \ll f_c/D$ or $B/f_c \ll 1/D$ implies that $s(t)$ is a narrowband signal. It
follows that, under the narrowband assumption, the received baseband signal
can be written as


$$\mathbf{x}(t) = s(t)\begin{bmatrix} e^{-j\omega_c\tau_1} \\ e^{-j\omega_c\tau_2} \\ \vdots \\ e^{-j\omega_c\tau_M} \end{bmatrix} \tag{1.5}$$

For this reason, we define the so-called “array manifold” $\mathbf{a}(\theta)$ as the array
response to a unit amplitude signal (i.e., the case where $s(t) = 1$):

$$\mathbf{a}(\theta) = \begin{bmatrix} \exp\{j2\pi d_1\sin\theta/\lambda\} \\ \exp\{j2\pi d_2\sin\theta/\lambda\} \\ \vdots \\ \exp\{j2\pi d_M\sin\theta/\lambda\} \end{bmatrix} \tag{1.6}$$

With no loss of generality we assume that the signal $s(t)$ has unit power, in which
case the complete narrowband model for the received signal can be written as

$$\mathbf{y}(t) = \sqrt{\mathrm{SNR}}\,s(t)\,\mathbf{a}(\theta_0) + \mathbf{v}(t) \tag{1.7}$$

where $\mathbf{v}(t)$ is a multivariate complex Gaussian noise vector with uncorrelated
elements having zero mean and unit variance, $\theta_0$ is the direction of the emitter,
and SNR is the signal-to-noise ratio.
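As a concrete illustration of the array manifold in Equation (1.6), the following minimal sketch evaluates it for a linear array whose element positions are given in wavelengths. The function name `linear_manifold` and the 8-element, half-wavelength example are assumptions made for illustration; they do not come from the text.

```python
import cmath
import math


def linear_manifold(positions_wl, theta_deg):
    """Return a(theta) of Equation (1.6) for a linear array.

    positions_wl -- element positions d_m / lambda, in wavelengths
    theta_deg    -- direction measured from the array normal, in degrees
    """
    u = math.sin(math.radians(theta_deg))
    return [cmath.exp(2j * math.pi * d * u) for d in positions_wl]


# Hypothetical example: 8 elements with half-wavelength spacing.
positions = [0.5 * m for m in range(8)]
a_broadside = linear_manifold(positions, 0.0)   # all entries equal 1
a_10deg = linear_manifold(positions, 10.0)      # linear phase ramp across the array
```

Every entry has unit modulus, so only the phase progression across the elements carries direction information.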

Direction-finding methods often use sampled versions of the array output, in
which case

$$\mathbf{y}[k] = \sqrt{\mathrm{SNR}}\,s[k]\,\mathbf{a}(\theta_0) + \mathbf{v}[k] \tag{1.8}$$

where $k$ is the index of the sample, or the “snapshot.” The noise $\mathbf{v}[k]$ is assumed
to be independent from snapshot to snapshot. The signal $s[k]$ may or may not be
independent from snapshot to snapshot, depending on its temporal correlation
properties. Equation (1.8) defines the narrowband signal model used in the rest
of this chapter.
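The sampled model of Equation (1.8) can be simulated directly. The sketch below is one possible way to do so, under assumptions that are not in the text: the signal sample is taken to be unit-modulus with a uniformly random phase, the array is a hypothetical 8-element linear array with half-wavelength spacing, and the helper name `snapshots` is illustrative.

```python
import cmath
import math
import random


def snapshots(positions_wl, theta0_deg, snr_db, num, rng):
    """Draw `num` snapshots y[k] following Equation (1.8)."""
    amp = math.sqrt(10.0 ** (snr_db / 10.0))
    u0 = math.sin(math.radians(theta0_deg))
    a0 = [cmath.exp(2j * math.pi * d * u0) for d in positions_wl]
    sigma = math.sqrt(0.5)  # per real/imaginary part, so that E|v_m|^2 = 1
    data = []
    for _ in range(num):
        # Assumption: unit-modulus signal sample with a uniformly random phase.
        s = cmath.exp(2j * math.pi * rng.random())
        data.append([amp * s * am
                     + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
                     for am in a0])
    return data


rng = random.Random(0)
positions = [0.5 * m for m in range(8)]
Y = snapshots(positions, 10.0, 60.0, 4, rng)  # 4 snapshots at 60 dB SNR
```

Noise is drawn independently for every element and every snapshot, matching the independence assumption stated above.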

1.3 DIRECTION-FINDING ALGORITHMS 

We now describe three common methods for direction finding using an antenna 
array. The first two involve linear combinations of the received signals; the third 
is nonlinear. 

1.3.1 Beamforming 

Combining the antenna outputs so that the signals from a given direction “line
up,” and can thus be added coherently, is the fundamental method used in
array-processing applications. Recall that the passband signal received by the array’s
$m$th element is $x_m(t) = s(t-\tau_m(\theta_0))\,e^{j\omega_c(t-\tau_m(\theta_0))}$, where the delays are functions
of the emitter direction $\theta_0$ and the array geometry.

With proper delays applied to them, the received signals line up. Assume that
$x_m(t)$ is delayed by $T-\tau_m(\theta_0)$, where $T$ is a delay common to all elements, to
yield $x_m(t-T+\tau_m(\theta_0)) = s(t-T)\,e^{j\omega_c(t-T)}$. In
other words, the properly delayed array outputs are identical and can be added up
to enhance the signal-to-noise ratio at the receiver by a factor of $M$, the coherent
array gain.

If a different set of delays is used corresponding to $\theta \neq \theta_0$, the signals will
not line up and the result of the sum will be a signal with power that is smaller
than in the case where $\theta = \theta_0$. Thus, the signal power at the beamformer output
will be maximized in the correct direction.

The processing just described, called delay-and-sum beamforming, can be
used for both wideband and narrowband signals. In the case where a signal is
narrowband, the beamformer can be implemented using appropriate phase shifts
instead of time delays. As was shown in Equation (1.5), the received signal
vector is represented (up to a common factor) by the array manifold $\mathbf{a}(\theta)$.
Combining the signals “in phase” is accomplished by the complex conjugate
of the array manifold. Thus, we compute $|\mathbf{a}(\theta)^H\mathbf{a}(\theta_0)|$. If the assumed direction
$\theta$ equals the true emitter direction, we get the largest value of the combined
signals, $|\mathbf{a}(\theta_0)^H\mathbf{a}(\theta_0)| = M$. If $\theta \neq \theta_0$, the signals are not added in phase and the
result of the sum is smaller. This type of processing is called frequency domain
beamforming or just beamforming.

In its most general form, a beamformer uses a weight vector $\mathbf{W}(\theta)$ to
linearly combine the array outputs. In most applications we are interested in the
magnitude or the power of the combiner output. In the latter case the output is
given by

$$p(\theta) = |\mathbf{W}(\theta)^H\mathbf{y}|^2 \tag{1.9}$$

where $\mathbf{y}$ is the received signal. In the absence of noise and assuming a unit power
signal, the beamformer output is

$$p(\theta) = \mathrm{SNR}\,|\mathbf{W}(\theta)^H\mathbf{a}(\theta_0)|^2 \tag{1.10}$$

When the beamformer is used for direction finding, its output $p(\theta)$ is computed
over a range of directions $\theta$. The direction where $p(\theta)$ reaches its largest value
is the estimated emitter direction.
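The scan-and-pick-the-peak procedure can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes the normalized weight vector W = a/|a|, a hypothetical 8-element half-wavelength linear array, noise-free data, and a plain 0.1-degree grid search; the names `beam_power` and `scan_peak` are illustrative.

```python
import cmath
import math


def manifold(positions_wl, theta_deg):
    u = math.sin(math.radians(theta_deg))
    return [cmath.exp(2j * math.pi * d * u) for d in positions_wl]


def beam_power(y, positions_wl, theta_deg):
    """p(theta) = |W(theta)^H y|^2 with W = a / |a| (Equation 1.9)."""
    a = manifold(positions_wl, theta_deg)
    norm = math.sqrt(len(a))  # |a(theta)| = sqrt(M) for unit-modulus entries
    out = sum(am.conjugate() * ym for am, ym in zip(a, y)) / norm
    return abs(out) ** 2


def scan_peak(y, positions_wl, step_deg=0.1):
    """Return the grid direction that maximizes the beamformer output."""
    count = int(round(180.0 / step_deg)) + 1
    grid = [-90.0 + step_deg * i for i in range(count)]
    return max(grid, key=lambda th: beam_power(y, positions_wl, th))


positions = [0.5 * m for m in range(8)]
y_clean = manifold(positions, 10.0)        # noise-free, SNR = 1, s = 1
theta_hat = scan_peak(y_clean, positions)
peak_power = beam_power(y_clean, positions, theta_hat)
```

With the normalized weights, the noise-free peak value equals SNR times M, which is the coherent array gain mentioned earlier.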

In this discussion we assume that $\mathbf{W}(\theta) = \mathbf{a}(\theta)$. In practice, this weight vector
will be modified by a window chosen to suppress the sidelobe level of the beam
pattern to a desired level. We use a normalized but nonwindowed weight vector
$\mathbf{W}(\theta) = \mathbf{a}(\theta)/|\mathbf{a}(\theta)|$, which makes the noise power at the beamformer output the
same as at the antenna elements.

It should be noted that the beamformer with this choice of weight vector is
in fact the generalized maximum likelihood estimator of the direction $\theta_0$, for the
signal model

$$\mathbf{y} = \alpha_0\,\mathbf{a}(\theta_0) + \mathbf{v} \tag{1.11}$$

where $\mathbf{y}$ is the vector of the signals at the array output, $\alpha_0$ is a complex scale
factor assumed to be unknown ($\alpha_0 = \sqrt{\mathrm{SNR}}\,s$ in the previous notation), $\mathbf{a}(\theta)$ is
the array manifold, $\theta_0$ is the unknown emitter direction, and $\mathbf{v}$ is a noise vector
composed of independent zero-mean unit-variance Gaussian random variables.
The generalized maximum likelihood estimate is given by joint minimization
of the error $|\mathbf{y} - \alpha_0\mathbf{a}(\theta_0)|^2$ over the unknowns $\theta_0$ and $\alpha_0$. Minimizing first over
$\alpha_0$ we have $\hat{\alpha}_0 = \mathbf{a}^H(\theta_0)\mathbf{y}/\mathbf{a}^H(\theta_0)\mathbf{a}(\theta_0)$. Inserting this into the error function we
have $|\mathbf{y} - \mathbf{a}(\theta_0)\mathbf{a}^H(\theta_0)\mathbf{y}/|\mathbf{a}(\theta_0)|^2|^2$. Minimizing this cost function with respect to
$\theta_0$ is equivalent to maximizing $\mathbf{y}^H\mathbf{a}(\theta_0)\mathbf{a}^H(\theta_0)\mathbf{y}/|\mathbf{a}(\theta_0)|^2$, which can be written
as $|\mathbf{W}^H(\theta_0)\mathbf{y}|^2$, the output of the beamformer presented earlier. Thus, the estimate
$\hat{\theta}_0$ obtained by maximizing the output power of the beamformer is the generalized
likelihood direction estimate.

Figure 1.2 depicts the beam pattern $p(\theta)$ for 16-element circular and linear
arrays in the case where $\theta_0 = 0$. As can be seen, $p(\theta)$ is maximized at the source
direction $\theta_0$. In the absence of noise (and practical imperfections), this method
provides a perfect estimate of direction. In the presence of noise, the peak location
is shifted, introducing a direction estimation error. The accuracy of the estimated
direction will be addressed in Section 1.4.

FIGURE 1.2 Beam patterns (nonwindowed) for 16-element linear (a) and circular (b) arrays with
uniformly spaced elements (half-wavelength spacing).


1.3.2 Monopulse 

An interesting variation of the beamformer involves a method, often referred to 
as monopulse , commonly used in radar systems for target tracking. This method 
involves taking the difference between the outputs of two beams pointing in 
slightly different directions. Let 

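$$b(\theta) = \frac{1}{\Delta}\left\{|\mathbf{a}^H(\theta+\Delta/2)\,\mathbf{y}|^2 - |\mathbf{a}^H(\theta-\Delta/2)\,\mathbf{y}|^2\right\} \tag{1.12}$$

denote the response of the monopulse system where $1/\Delta$ is a convenient
normalization factor. In other words

$$b(\theta) = \frac{1}{\Delta}\left\{p(\theta+\Delta/2) - p(\theta-\Delta/2)\right\} \tag{1.13}$$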































where $p(\theta)$ is the beam pattern defined in the previous section. If $\Delta$ is small, this
difference can be replaced by the derivative of $p(\theta)$:

$$b(\theta) \approx \dot{p}(\theta) \tag{1.14}$$

In fact, $\Delta$ can be on the order of one beamwidth and $b(\theta)$ will still be well
approximated by $\dot{p}(\theta)$ because $b(\theta)$ is nearly linear over a significant range of
angles around the zero-response point. This can be seen in Figure 1.3, which
depicts the response function $b(\theta)$ for 16-element circular and linear arrays in
the case where $\theta_0 = 0$. As can be seen, $b(\theta) = 0$ at the source direction $\theta_0$. In the
absence of noise (and practical imperfections), this method provides a perfect
estimate of direction. In the presence of noise, the zero location is shifted,
introducing a direction estimation error. The accuracy of the estimated direction will
be addressed in Section 1.4.

Note that the output is positive if the emitter is to the right of the direction in 
which the difference beam is pointed and negative if it is to the left. This ability 
to determine the direction of the emitter relative to the pointing direction by the 
sign of the response is useful in tracking applications. 
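The monopulse response and its sign behavior can be illustrated with a short sketch. The assumptions here are not from the text: a hypothetical 8-element half-wavelength linear array, noise-free data from an emitter at 10 degrees, a beam offset of 1 degree, and a simple bisection (the helper `zero_crossing`) to locate the zero of the response inside a bracket that contains it.

```python
import cmath
import math


def beam_power(y, positions_wl, theta_deg):
    u = math.sin(math.radians(theta_deg))
    out = sum(cmath.exp(-2j * math.pi * d * u) * ym
              for d, ym in zip(positions_wl, y))
    return abs(out) ** 2 / len(y)


def monopulse(y, positions_wl, theta_deg, delta_deg=1.0):
    """b(theta) of Equation (1.13): scaled difference of two offset beams."""
    hi = beam_power(y, positions_wl, theta_deg + delta_deg / 2.0)
    lo = beam_power(y, positions_wl, theta_deg - delta_deg / 2.0)
    return (hi - lo) / delta_deg


def zero_crossing(y, positions_wl, left_deg, right_deg, iters=40):
    """Bisection for b(theta) = 0, assuming b > 0 at left_deg and b < 0 at right_deg."""
    for _ in range(iters):
        mid = 0.5 * (left_deg + right_deg)
        if monopulse(y, positions_wl, mid) > 0.0:
            left_deg = mid
        else:
            right_deg = mid
    return 0.5 * (left_deg + right_deg)


positions = [0.5 * m for m in range(8)]
u0 = math.sin(math.radians(10.0))
y_clean = [cmath.exp(2j * math.pi * d * u0) for d in positions]  # emitter at 10 degrees
theta_hat = zero_crossing(y_clean, positions, 5.0, 15.0)
```

The bracket endpoints exploit the sign property described above: the response is positive when pointing to the left of the emitter and negative when pointing to its right.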




FIGURE 1.3 Response of a monopulse system for 16-element linear (a) and circular (b) arrays,
with uniformly spaced elements.

1.3.3 Phase Matching

The methods just described involve a linear combining of the antenna outputs 
followed by subsequent processing. Here we consider a nonlinear method that 
uses the phase of the received signals rather than the signals themselves. In the 
case of a single emitter with no multipath, information about its direction resides 
entirely in the signals’ phases (delays), not in their amplitudes. Using phase 
information only is therefore expected to be as accurate as using the signal itself, 
as well as potentially more robust to effects that cause amplitude (but not phase) 
impairments such as unmatched antenna and amplifier gains. 

Let $\boldsymbol{\phi}$ denote the vector of phase measurements and $\boldsymbol{\phi}(\theta)$ denote the vector of
phases of the array manifold $\mathbf{a}(\theta)$. The direction estimate is obtained by finding
the value of $\theta$ for which the distance between the measured and assumed phase
vectors is minimized. The squared norm of the difference between the phase
vectors can be used as the distance measure:

$$d(\theta) = |\boldsymbol{\phi} - \boldsymbol{\phi}(\theta)|^2 \tag{1.15}$$


As discussed earlier, the phase of the received signal is the sum of the 
geometry/direction-related phase and a random phase component common to 
all antennas. This random component must be eliminated or otherwise taken 
care of for the direction-finding algorithm to work properly. A common way to 
achieve this is to measure relative rather than absolute phases, which can be done 
by designating one of the antennas as a phase reference and measuring phases 
relative to it. The phase of the reference antenna is assumed to be zero. Similarly, 
the array manifold $\mathbf{a}(\theta)$ is computed so that the phase of the reference antenna
is zero. The performance of this method is influenced by the particular antenna
chosen as the reference. An alternative approach is to eliminate dependence on
a particular reference antenna by using the mean of the phase vector as the
reference. In other words, we force both the measured and manifold phase vectors
to have zero mean.

A more systematic approach to this problem is to include an unknown additive
phase term $\beta$ in the signal model and jointly estimate the unknown direction and
additive phase terms. In other words,

$$\min_{\theta,\beta}\,|\boldsymbol{\phi} - \boldsymbol{\phi}(\theta) - \mathbf{1}\beta|^2 \tag{1.16}$$

where $\mathbf{1}$ is an $M \times 1$ vector with elements that are 1. Minimization over $\beta$ yields

$$\hat{\beta} = \frac{1}{M}\mathbf{1}^T(\boldsymbol{\phi} - \boldsymbol{\phi}(\theta)) \tag{1.17}$$

Inserting this into the cost function being minimized, we obtain

$$\min_{\theta}\,\left|\left(\mathbf{I} - \frac{1}{M}\mathbf{1}\mathbf{1}^T\right)(\boldsymbol{\phi} - \boldsymbol{\phi}(\theta))\right|^2 \tag{1.18}$$

which is “optimal” from a statistical standpoint, but somewhat more complicated
than the methods presented earlier.

FIGURE 1.4 Phase cost function versus direction for 16-element arrays: (a) linear and (b) circular,
with uniformly spaced elements.

Figure 1.4 depicts the phase cost function $d(\theta)$ in Equation (1.15) for
16-element circular and linear arrays in the case where $\theta_0 = 0$. As can be seen, $d(\theta)$
is minimized at the source direction $\theta_0$. In the absence of noise and practical
imperfections, this method provides a perfect estimate of direction. In the
presence of noise, the location of the minimum is shifted, introducing a direction
estimation error. The accuracy of the estimated direction will be addressed in
Section 1.4.
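A minimal sketch of phase matching with the mean-removed cost of Equation (1.18) is given below. Several details are assumptions of this sketch rather than statements from the text: phases are obtained from the complex samples, the phase mismatch is first referenced to element 0 so that the common phase cannot cause wrap-around, and the array is a hypothetical 8-element linear array with half-wavelength spacing.

```python
import cmath
import math


def phase_cost(y, positions_wl, theta_deg):
    """Cost of Equation (1.18): squared norm of the mean-removed phase mismatch."""
    u = math.sin(math.radians(theta_deg))
    # Remove the manifold phase for the trial direction from each measurement.
    z = [ym * cmath.exp(-2j * math.pi * d * u) for d, ym in zip(positions_wl, y)]
    # Assumption of this sketch: reference to element 0, then wrap to (-pi, pi].
    diff = [cmath.phase(zm * z[0].conjugate()) for zm in z]
    mean = sum(diff) / len(diff)  # estimate of the common phase (Equation 1.17)
    return sum((dm - mean) ** 2 for dm in diff)


positions = [0.5 * m for m in range(8)]
u0 = math.sin(math.radians(10.0))
beta = 1.0  # unknown common phase, in radians
y = [cmath.exp(1j * beta) * cmath.exp(2j * math.pi * d * u0) for d in positions]

grid = [-90.0 + 0.1 * i for i in range(1801)]
theta_hat = min(grid, key=lambda th: phase_cost(y, positions, th))
```

Because only phases enter the cost, scaling the amplitude of any element of `y` leaves the estimate unchanged, which reflects the robustness to gain mismatch noted at the start of this section.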


1.4 DIRECTION-FINDING ACCURACY 

The accuracy with which the direction of an emitter can be determined is 
an important parameter of any direction-finding system. It depends on
implementation imperfections, propagation effects (multipath, variability of the angle
of arrival), interference, signal-to-noise ratio, and so forth. In this section we
consider the accuracy of an ideal system where the only effect corrupting the 
measurements is additive measurement noise. 

Direction-finding accuracy is measured by the mean-square error (MSE) of
the estimator:

$$\mathrm{MSE}\{\hat{\theta}\} = E\{|\hat{\theta} - \theta|^2\} \tag{1.19}$$

where $\hat{\theta}$ is the direction estimate and $\theta$ is the true direction. Throughout this
section we consider direction-finding methods that provide unbiased estimates
of direction. In other words,

$$E\{\hat{\theta}\} = \theta \tag{1.20}$$

In this case, the MSE equals the variance of the estimated direction:

$$\mathrm{MSE}\{\hat{\theta}\} = E\{|\hat{\theta} - E\{\hat{\theta}\}|^2\} = \mathrm{VAR}\{\hat{\theta}\} \tag{1.21}$$

In the biased case,

$$\mathrm{MSE}\{\hat{\theta}\} = E\{|(\hat{\theta} - E\{\hat{\theta}\}) - (\theta - E\{\hat{\theta}\})|^2\} = \mathrm{VAR}\{\hat{\theta}\} + |\theta - E\{\hat{\theta}\}|^2 \tag{1.22}$$

where $b(\theta) = \theta - E\{\hat{\theta}\}$ is the bias of the estimator.
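The decomposition of the MSE into variance plus squared bias can be checked numerically on sample statistics. The sketch below uses a small, made-up set of direction estimates (hypothetical values, in degrees) and the sign convention for the bias given above.

```python
def mse_var_bias(estimates, truth):
    """Sample MSE, variance, and bias (bias = truth - mean, as in the text)."""
    n = len(estimates)
    mean = sum(estimates) / n
    mse = sum((e - truth) ** 2 for e in estimates) / n
    var = sum((e - mean) ** 2 for e in estimates) / n
    bias = truth - mean
    return mse, var, bias


# Hypothetical direction estimates in degrees for a true direction of 10 degrees.
estimates = [10.3, 9.8, 10.6, 10.1, 10.2]
mse, var, bias = mse_var_bias(estimates, 10.0)
```

For these numbers the sample mean is 10.2, so the bias is -0.2 and the MSE exceeds the variance by the squared bias, 0.04.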


1.4.1 The Cramér-Rao Bound


The Cramér-Rao bound (CRB) (Cramér, 1951) is a useful tool for assessing the
accuracy of parameter estimation methods, as it provides a lower bound on the
accuracy of any unbiased estimator. There are versions of the CRB for biased
estimators as well, but we will not discuss them here.

For any unbiased estimate $\hat{\theta}$,

$$\mathrm{MSE}\{\hat{\theta}\} = \mathrm{VAR}\{\hat{\theta}\} \geq \mathrm{CRB} \tag{1.23}$$

The CRB provides an algorithm-independent benchmark against which various
algorithms can be compared.

In Section A.1.2 we derive the CRB for the direction-finding problem and
show that

$$\mathrm{CRB} = \frac{1}{2K\,\mathrm{SNR}\,|\dot{\mathbf{a}}(\theta)|^2} \tag{1.24}$$

where $K$ is the number of snapshots used by the estimator, SNR is the
signal-to-noise ratio of the signal received at each antenna, and $\dot{\mathbf{a}}(\theta)$ is the derivative
of the array manifold with respect to $\theta$. This is a simple expression that shows
the variance of the direction estimation error to be inversely proportional to the
post-integration signal-to-noise ratio $K\,\mathrm{SNR}$ and to a factor $|\dot{\mathbf{a}}(\theta)|^2$ that depends
on the particular array used by the system.

Evaluating $|\dot{\mathbf{a}}(\theta)|^2$, we obtain more explicit forms of the CRB (see
Section A.1.3). For an $M$-element linear array we get

$$\mathrm{CRB} = \frac{\lambda^2}{8\pi^2 K\,\mathrm{SNR}\,\cos^2\theta\;\bar{d}^2} \tag{1.25}$$

where

$$\bar{d}^2 = \sum_{m=1}^{M} d_m^2 \tag{1.26}$$

and $d_m$ is the distance of the $m$th antenna from the phase center of the array.

For a uniformly spaced linear array,

$$d_m = (m-1)\frac{L}{M-1} - \frac{L}{2} \tag{1.27}$$

where $L$ is the array aperture. In this case, $\bar{d}^2$ can be expressed in closed form as
a function of $L$ and $M$.
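The sketch below evaluates Equations (1.25) through (1.27) numerically for a hypothetical 8-element array with an aperture of 3.5 wavelengths (half-wavelength spacing). The closed form used for the sum of squared positions, L^2 M (M + 1) / (12 (M - 1)), is derived for this sketch from the standard sum of squares and is not quoted from the text; the function names are illustrative.

```python
import math


def ula_positions(num_elements, aperture_wl):
    """Element positions of Equation (1.27), in wavelengths, centered on the array."""
    spacing = aperture_wl / (num_elements - 1)
    return [(m - 1) * spacing - aperture_wl / 2.0
            for m in range(1, num_elements + 1)]


def crb_linear(positions_wl, theta_deg, snr_db, num_snapshots):
    """CRB of Equation (1.25) in rad^2; positions measured from the phase center."""
    snr = 10.0 ** (snr_db / 10.0)
    d2 = sum(d * d for d in positions_wl)        # Equation (1.26), in wavelengths^2
    c2 = math.cos(math.radians(theta_deg)) ** 2
    return 1.0 / (8.0 * math.pi ** 2 * num_snapshots * snr * c2 * d2)


M, L = 8, 3.5
pos = ula_positions(M, L)
d2_sum = sum(d * d for d in pos)
d2_closed = L * L * M * (M + 1) / (12.0 * (M - 1))  # closed form derived for this sketch
crb_std_deg = math.degrees(math.sqrt(crb_linear(pos, 10.0, 20.0, 1)))
```

For this example both expressions give 10.5 squared wavelengths, and the bound on the standard deviation at 20 dB with a single snapshot is roughly 0.2 degrees.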

For an $M$-element circular array with radius $R$, we have

$$\mathrm{CRB} = \frac{\lambda^2}{8\pi^2 K\,\mathrm{SNR}\,R^2 c^2(\theta)} \tag{1.28}$$

where

$$c^2(\theta) = \sum_{m=1}^{M} \cos^2(\theta - \phi_m) \tag{1.29}$$

and $\phi_m$ are the antenna location angles on the circle.

For a uniformly spaced circular array it is straightforward to show that

$$c^2 = \frac{M}{2} \quad \text{for } \phi_m = 2(m-1)\pi/M,\; m = 1, \ldots, M \tag{1.30}$$
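The result in Equation (1.30) is easy to confirm numerically. The sketch below assumes that the geometry factor is a sum of squared cosines of the difference between the emitter direction and the element angles; the same value M/2 is obtained if squared sines are used instead, so the check does not depend on that convention. The 16-element array and the test directions are arbitrary choices.

```python
import math


def c2_circular(theta_deg, element_angles_rad):
    """Geometry factor: sum of cos^2(theta - phi_m) over the array elements."""
    th = math.radians(theta_deg)
    return sum(math.cos(th - phi) ** 2 for phi in element_angles_rad)


M = 16
angles = [2.0 * math.pi * (m - 1) / M for m in range(1, M + 1)]
values = [c2_circular(th, angles) for th in (0.0, 17.0, 45.0, 123.0)]
```

Every entry of `values` equals M/2 up to rounding, which shows that the CRB of a uniformly spaced circular array does not depend on the emitter direction.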


1.4.2 Simulation 


To evaluate the accuracy of the direction-finding algorithms described in
Section 1.3, we used a Monte Carlo simulation generating 5000 independent
data vectors $\mathbf{y}$, estimated the emitter direction, and computed the estimate's bias
and standard deviation. An 8-element uniformly spaced linear array with
half-wavelength spacing was used, and the emitter was at $\theta_0 = 10°$. Estimates were
based on a single snapshot. Figures 1.5 and 1.6 depict the standard deviation
and bias of the estimated direction for the three methods as a function of the
signal-to-noise ratio. Figure 1.5 includes the CRB for reference.

We observe that all three methods are asymptotically unbiased and
statistically efficient. In other words, as the signal-to-noise ratio increases the bias tends
to zero and the variance (or standard deviation) tends to the CRB. Results for a
circular array are similar, and the corresponding figures are therefore omitted.
In Sections A.1.4 through A.1.8 we present analysis that further validates these
observations. We conclude that under ideal conditions these classical methods
are as accurate as possible in the presence of measurement noise.
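A reduced version of such a Monte Carlo experiment can be sketched for the beamformer alone. The array, emitter direction, and single-snapshot setting follow the description above; everything else is an assumption of this sketch: 200 trials instead of 5000 to keep the run short, one SNR value (20 dB), a unit-modulus random-phase signal, and a two-stage grid search for the peak.

```python
import cmath
import math
import random

M = 8
POS = [0.5 * m - 0.25 * (M - 1) for m in range(M)]  # centered positions, wavelengths
THETA0, SNR_DB, TRIALS = 10.0, 20.0, 200            # fewer trials than in the text


def steer_conj(theta_deg):
    u = math.sin(math.radians(theta_deg))
    return [cmath.exp(-2j * math.pi * d * u) for d in POS]


def power(y, w_conj):
    return abs(sum(w * ym for w, ym in zip(w_conj, y))) ** 2


COARSE = [(-90.0 + i, steer_conj(-90.0 + i)) for i in range(181)]


def estimate(y):
    """Beamformer peak: 1-degree coarse scan, then a 0.01-degree local scan."""
    center = max(COARSE, key=lambda item: power(y, item[1]))[0]
    fine = [center - 1.0 + 0.01 * i for i in range(201)]
    return max(fine, key=lambda th: power(y, steer_conj(th)))


rng = random.Random(1)
amp = math.sqrt(10.0 ** (SNR_DB / 10.0))
a0 = [w.conjugate() for w in steer_conj(THETA0)]
sigma = math.sqrt(0.5)  # per real/imaginary part, unit total noise variance
errors = []
for _ in range(TRIALS):
    s = cmath.exp(2j * math.pi * rng.random())
    y = [amp * s * am + complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
         for am in a0]
    errors.append(estimate(y) - THETA0)

bias = sum(errors) / TRIALS
std = math.sqrt(sum((e - bias) ** 2 for e in errors) / TRIALS)
d2 = sum(d * d for d in POS)
crb_std = math.degrees(math.sqrt(
    1.0 / (8 * math.pi ** 2 * 10.0 ** (SNR_DB / 10.0)
           * math.cos(math.radians(THETA0)) ** 2 * d2)))
```

At this SNR, which is well above the threshold, `std` should lie close to `crb_std` (about 0.2 degrees) and `bias` should be near zero, in line with the observations above.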

It should be noted that the CRB is a “tight” bound only at high
signal-to-noise ratios. In other words, it is possible to design estimators with a variance
that approaches the CRB asymptotically as those ratios increase. At low ratios
the CRB is no longer tight and the variance of any estimator is larger, possibly
much larger. In plotting the variance of the estimator as a function of the SNR, we
invariably observe a threshold SNR. When the SNR exceeds this threshold, the
variance closely tracks the CRB; when it falls below the threshold, the variance
departs sharply from the CRB. The Barankin (1949), Ziv and Zakai (1969), and
other “global” bounds better indicate achievable performance at a low SNR and
display this threshold phenomenon, absent in the CRB, which is purely “local.”
In Figures 1.5 and 1.6 the threshold SNR can be seen at around 5 dB.

FIGURE 1.5 Standard deviation of the estimation errors for an 8-element uniformly spaced linear
array using beamforming, monopulse, and phase-matching methods. The CRB is shown for reference.

FIGURE 1.6 Bias errors for an 8-element uniformly spaced linear array using the beamforming,
monopulse, and phase-matching methods.
1.5 MULTIPATH AND CO-CHANNEL INTERFERENCE 

We considered direction finding for a single emitter with the received signal
contaminated by noise only. In practice, there may be multiple co-channel
signals arriving from different emitters or from a single emitter through multiple
propagation paths. These spurious signals, especially if their direction is close
to that of the signal of interest, degrade the accuracy of the direction-finding
system. Specifically, the direction estimates provided by the methods discussed
earlier become biased. In this section we study this bias as a function of some of
the interfering signal parameters. For simplicity we limit our discussion to the
beamformer described in Section 1.3.1.

A fundamental way to handle multiple co-channel signals is via processing
techniques capable of resolving the composite signal into its individual
components and estimating the direction of each component separately. We will discuss
this approach in more detail in Section 1.6. Here we note only that the beamformer
resolves signals with directions that are separated by more than a beamwidth,
but fails to resolve those more closely spaced. In this section we focus on closely
spaced received signals not resolved by the direction-finding algorithm.

Consider the case where two signals impinge on the array from directions $\theta_0$
and $\theta_1$. Without loss of generality, assume that the signal at $\theta_0$ corresponds to the
emitter the direction of which we want to estimate. The received signal vector is
given by

$$\mathbf{y} = \sqrt{\mathrm{SNR}_0}\,s_0\,\mathbf{a}(\theta_0) + \sqrt{\mathrm{SNR}_1}\,s_1\,\mathbf{a}(\theta_1) + \mathbf{v} \tag{1.31}$$

where $\mathrm{SNR}_0$ and $\mathrm{SNR}_1$ are the signal-to-noise ratios of the emitters at $\theta_0$ and $\theta_1$,
respectively. As before, the output of the beamformer is

$$p(\theta) = |\mathbf{W}(\theta)^H\mathbf{y}|^2 \tag{1.32}$$

where $\mathbf{W}(\theta) = \mathbf{a}(\theta)/|\mathbf{a}(\theta)|$. The expected value of $p(\theta)$ is given by

$$E\{p(\theta)\} = \mathrm{SNR}_0\,|\mathbf{W}(\theta)^H\mathbf{a}(\theta_0)|^2 + \mathrm{SNR}_1\,|\mathbf{W}(\theta)^H\mathbf{a}(\theta_1)|^2 + 2\sqrt{\mathrm{SNR}_0}\sqrt{\mathrm{SNR}_1}\,\Re\{\rho_{01}\,\mathbf{W}(\theta)^H\mathbf{a}(\theta_0)\,(\mathbf{W}(\theta)^H\mathbf{a}(\theta_1))^*\} + 1 \tag{1.33}$$

where $\rho_{01} = E\{s_0 s_1^*\}$, $E\{|s_0|^2\} = E\{|s_1|^2\} = 1$, and $\Re\{\cdot\}$ denotes the real part. This
expected value represents the case where the beamformer output is averaged over
multiple snapshots:

$$\bar{p}(\theta) = \frac{1}{K}\sum_{k=1}^{K} |\mathbf{W}(\theta)^H\mathbf{y}[k]|^2 \tag{1.34}$$

As $K$ becomes large, the averaged beamformer output approaches its expected
value (i.e., $\bar{p}(\theta) \approx E\{p(\theta)\}$).

Consider the case where the two signals are uncorrelated (i.e., $E\{s_0 s_1^*\} = 0$).
Then

$$E\{p(\theta)\} = \mathrm{SNR}_0\,|\mathbf{W}(\theta)^H\mathbf{a}(\theta_0)|^2 + \mathrm{SNR}_1\,|\mathbf{W}(\theta)^H\mathbf{a}(\theta_1)|^2 + 1 \tag{1.35}$$

The signals are generally uncorrelated if they originate from two different
emitters or if they originate from a single emitter in the presence of motion-induced
Doppler.
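The bias caused by an unresolved, uncorrelated second signal can be illustrated by locating the peak of the expected beamformer output in Equation (1.35). The sketch below assumes an 8-element half-wavelength linear array with signals at 0 and 6.5 degrees, the second one 3 dB weaker; the absolute SNR values (10 and 5 in linear units) and the 0.01-degree search grid are arbitrary choices made for illustration.

```python
import cmath
import math

POS = [0.5 * m for m in range(8)]  # element positions in wavelengths


def gain(theta_deg, source_deg):
    """|W(theta)^H a(source)|^2 with W = a / |a|."""
    du = math.sin(math.radians(source_deg)) - math.sin(math.radians(theta_deg))
    return abs(sum(cmath.exp(2j * math.pi * d * du) for d in POS)) ** 2 / len(POS)


def expected_power(theta_deg, snr0, snr1, theta0_deg, theta1_deg):
    """Equation (1.35): two uncorrelated signals plus unit-variance noise."""
    return (snr0 * gain(theta_deg, theta0_deg)
            + snr1 * gain(theta_deg, theta1_deg) + 1.0)


grid = [-90.0 + 0.01 * i for i in range(18001)]
peak = max(grid, key=lambda th: expected_power(th, 10.0, 5.0, 0.0, 6.5))
bias = peak - 0.0  # error in the estimate of the direction of the stronger signal
```

The single peak falls between the two signal directions, pulled away from 0 degrees toward the weaker signal.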

Next consider the case where the signals originate from a single emitter in
the absence of motion. In this case, they are the same except for a complex scale
factor. Thus, $s_1 = s_0 e^{-j\phi}$, where $\phi$ is the phase of this factor (the relative
magnitudes are absorbed in $\mathrm{SNR}_0$ and $\mathrm{SNR}_1$). The correlation coefficient is therefore
$\rho_{01} = e^{j\phi}$, so the average beamformer output is

$$E\{p(\theta)\} = \left|\sqrt{\mathrm{SNR}_0}\,\mathbf{W}(\theta)^H\mathbf{a}(\theta_0) + \sqrt{\mathrm{SNR}_1}\,e^{-j\phi}\,\mathbf{W}(\theta)^H\mathbf{a}(\theta_1)\right|^2 + 1 \tag{1.36}$$

Figures 1.7, 1.8, and 1.9 illustrate the effect of the second signal on the
estimated direction. They depict results for an 8-element uniformly spaced linear
array with half-wavelength spacing. In Figures 1.7 and 1.9 the signal directions
are $\theta_0 = 0°$ and $\theta_1 = 6.5°$. The beamwidth of the array is 13°, so the signal
directions are a half-beamwidth apart. In Figure 1.8, $\theta_0 = 0°$ and $\theta_1 = 3.25°$, 6.5°,
and 13°.

FIGURE 1.7 Composite beam response for two uncorrelated signals at $\theta_0 = 0°$ and $\theta_1 = 6.5°$. The
power of the second signal is 3 dB below that of the first. The dashed and dot-dashed curves are the
responses to the individual signals, and the solid curve is the composite response.

FIGURE 1.8 Bias-of-direction estimation using a beamformer for two uncorrelated signals for
different angular separations: 1, 0.5, and 0.25 of a beamwidth.

Figure 1.7 depicts the beam pattern computed using Equation (1.35) and its
two components corresponding to two uncorrelated signals. The power of the
signal at $\theta_1$ is half the power of the signal at $\theta_0$; that is, the SNR difference is
3 dB. This figure illustrates two important facts: (1) The beamformer is unable
to resolve the two signals, so the response has a single peak corresponding to the
composite signal; (2) the peak location is between $\theta_0$ and $\theta_1$, providing a biased
estimate of $\theta_0$.

In Figure 1.8 the bias of the direction estimates is shown as a function of
the SNR difference between the two uncorrelated signals for three values of the
separation $\theta_1 - \theta_0$ between the signal directions: 1 beamwidth, 0.5 beamwidth,
and 0.25 beamwidth. As expected, in all three cases the bias decreases as the
SNR difference increases. Initially it increases as the separation increases, but at
some point it begins to decrease. This is to be expected because the beamformer
is able to resolve the two signals for separations larger than a beamwidth.

FIGURE 1.9 Bias-of-direction estimation using a beamformer for two correlated signals for a
0.5-beamwidth separation. Shown are the maximum, minimum, and median of the bias computed
over all possible values of the relative phase between signals.

In Figure 1.9 the bias of the direction estimates is shown as a function of
the SNR difference between the two correlated signals for a separation of 0.5
beamwidth. The estimates are computed by finding the peak of the beam response
in Equation (1.36). Note that the beam response depends on the correlation
phase $\phi$. We computed the response and the corresponding direction estimates
for $0 \le \phi \le 2\pi$, and from this we evaluated the largest, smallest, and median
values of the bias $|\hat{\theta} - \theta_0|$, as shown in the figure. Note that the bias varies over
a significant range of values and in some cases is larger than in the uncorrelated
case. Therefore, the bias due to correlated multipath may be larger than that due
to uncorrelated multipath. More important, the magnitude of the bias is highly
variable from one scenario to the next.
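The phase sweep described above can be sketched as follows. This is an illustration under stated assumptions, not the computation used for the figure: the weights are the untapered unit-norm steering vector, the phase reference is the first array element, and the linear SNR values (10 and 5) and the 72-point phase grid are arbitrary choices.

```python
import numpy as np

def steering(theta_deg, M=8):
    """ULA manifold, half-wavelength spacing (assumed); returns an M x N matrix."""
    theta = np.deg2rad(np.atleast_1d(theta_deg).astype(float))
    m = np.arange(M)[:, None]
    return np.exp(1j * np.pi * m * np.sin(theta)[None, :])

def correlated_response(scan_deg, th0, th1, snr0, snr1, phi, M=8):
    """Average beamformer output for two fully correlated signals, as in Equation (1.36)."""
    W = steering(scan_deg, M) / np.sqrt(M)
    g0 = W.conj().T @ steering(th0, M)[:, 0]    # w(theta)^H a(theta_0)
    g1 = W.conj().T @ steering(th1, M)[:, 0]    # w(theta)^H a(theta_1)
    return np.abs(np.sqrt(snr0) * g0 + np.sqrt(snr1) * np.exp(1j * phi) * g1) ** 2 + 1.0

scan = np.linspace(-30.0, 30.0, 6001)
phis = np.linspace(0.0, 2.0 * np.pi, 72, endpoint=False)
bias = np.array([
    abs(scan[np.argmax(correlated_response(scan, 0.0, 6.5, 10.0, 5.0, phi))] - 0.0)
    for phi in phis
])
print(f"bias min {bias.min():.2f}, median {np.median(bias):.2f}, max {bias.max():.2f} deg")
```

Repeating the sweep for several SNR differences would give curves of the kind summarized by the maximum, minimum, and median in the figure.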

1.6 DIRECTION FINDING FOR MULTIPLE 
CO-CHANNEL EMITTERS 

We considered direction-finding methods that associate a single direction with 
the received signal. When the signal is a composite of multiple components, the 
direction estimate of the signal of interest is biased to a smaller or larger degree. 
In this section we consider methods that associate multiple directions with the 
different signal components. 

First we note that, when the emitters are well separated (i.e., their directions 
are more than a beamwidth apart), the techniques discussed earlier are in fact 
capable of estimating the individual component directions. Figure 1.10 depicts
the response of the beamformer to the signals received from two emitters: one at
direction $\theta_0 = -20^\circ$ and one at $\theta_1 = 20^\circ$. The signal from the emitter at $\theta_1$ has half
the power of the signal from the emitter at $\theta_0$. An 8-element uniformly spaced
array was used with half-wavelength element spacing. The beam response is
computed using a Hamming window. It has two clearly identifiable peaks located
at the emitter directions. Thus, the directions of both emitters can be reliably
estimated in this case.

FIGURE 1.10 Beamformer response for two uncorrelated signals at $\theta_0 = -20^\circ$ and $\theta_1 = 20^\circ$. The
power of the second signal is 3 dB below that of the first.

If the emitters are spaced less than a beamwidth apart, the beam response
will have a single peak, as was shown in Figure 1.7. The beamformer will
therefore fail to resolve the signals, producing a single biased direction estimate. In
this case other methods are needed to obtain unbiased direction estimates of all
emitters. The ability to estimate the directions of signals separated by less than a
beamwidth is often called “super-resolution.” Conventional methods are limited
by the Rayleigh resolution limit corresponding roughly to one beamwidth.

The signal model considered in this section is 

$$\mathbf{y} = \alpha_0 \mathbf{a}(\theta_0) + \alpha_1 \mathbf{a}(\theta_1) + \mathbf{v} \tag{1.37}$$

where $\mathbf{y}$ is the vector of the signals at the array output, $\alpha_0, \alpha_1$ are
complex coefficients representing the magnitude and phase of the two emitters
($\alpha_p = s_p \sqrt{\mathrm{SNR}_p}$, $p = 0, 1$ in the earlier notation), $\mathbf{a}(\theta)$ is the array manifold, and
$\theta_0, \theta_1$ are the emitter directions. This is a straightforward extension of the model
presented in Equation (1.7), and it can be extended in an obvious manner to $P$
signals, in which case

$$\mathbf{y} = \sum_{p=0}^{P-1} \alpha_p \mathbf{a}(\theta_p) + \mathbf{v} \tag{1.38}$$

For simplicity, we consider here only the two-signal case.

Given the measurement $\mathbf{y}$, we want to estimate the directions $\theta_0$ and $\theta_1$. We
assume that neither $\alpha_0$ nor $\alpha_1$ is known to the receiver.

1.6.1 Maximum Likelihood Direction Estimation 

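A conceptually straightforward solution to the problem defined previously is to
compute the maximum likelihood estimator (MLE) for the unknown parameters
in Equation (1.37). Assuming the noise is Gaussian, so is $\mathbf{y}$, and therefore the
MLE is equivalent to the nonlinear least-squares estimator, which minimizes

$$d(\theta_0, \theta_1, \alpha_0, \alpha_1) = |\mathbf{y} - \alpha_0 \mathbf{a}(\theta_0) - \alpha_1 \mathbf{a}(\theta_1)|^2 \tag{1.39}$$

over the unknown parameters. Because $\mathbf{y}$ is a linear function of the scale factors
$\alpha_0, \alpha_1$, the minimization over these parameters can be done analytically:

$$\begin{bmatrix} \hat{\alpha}_0 \\ \hat{\alpha}_1 \end{bmatrix} = \left(\mathbf{S}^H(\theta_0,\theta_1)\,\mathbf{S}(\theta_0,\theta_1)\right)^{-1} \mathbf{S}^H(\theta_0,\theta_1)\,\mathbf{y} \tag{1.40}$$

where

$$\mathbf{S}(\theta_0, \theta_1) = [\mathbf{a}(\theta_0), \mathbf{a}(\theta_1)] \tag{1.41}$$

Inserting this result into Equation (1.39), we get

$$d(\theta_0, \theta_1) = |\mathbf{y} - \mathbf{P}_{S(\theta_0,\theta_1)}\mathbf{y}|^2 = |\mathbf{P}^{\perp}_{S(\theta_0,\theta_1)}\mathbf{y}|^2 = \mathbf{y}^H \mathbf{P}^{\perp}_{S(\theta_0,\theta_1)}\mathbf{y} \tag{1.42}$$

where $\mathbf{P}_{S(\theta_0,\theta_1)}$ denotes the projection operator onto the subspace spanned by the
columns of $\mathbf{S}(\theta_0,\theta_1)$. Similarly, $\mathbf{P}^{\perp}_{S} = \mathbf{I} - \mathbf{P}_{S}$ denotes the projection onto the
orthogonal complement of that subspace. The minimization of $d(\theta_0,\theta_1)$ must be
carried out numerically and is generally computationally intensive,
requiring a two-dimensional search. In general a “brute-force” search over a
selected grid of values of $(\theta_0,\theta_1)$ is necessary, followed by interpolation in
the neighborhood of the minimum point to compute the final estimate. (See
Section A.1.9 for details regarding this interpolation step.)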
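The brute-force grid search over the cost in Equation (1.42) can be sketched as below. This is a minimal illustration, not the authors' implementation: the 0.5° grid, the 30-dB per-emitter SNR, the random amplitude phases, and the function names are all assumptions, and the final interpolation step is omitted. The projection residual is obtained from a least-squares solve rather than by forming the projector explicitly.

```python
import numpy as np

def steering(theta_deg, M=8):
    """ULA manifold, half-wavelength spacing (assumed); returns an M x N matrix."""
    theta = np.deg2rad(np.atleast_1d(theta_deg).astype(float))
    m = np.arange(M)[:, None]
    return np.exp(1j * np.pi * m * np.sin(theta)[None, :])

def ml_cost(y, th0, th1):
    """|P_S^perp y|^2 for S = [a(th0), a(th1)], computed as a least-squares residual."""
    S = steering([th0, th1], len(y))
    coef, *_ = np.linalg.lstsq(S, y, rcond=None)   # amplitude estimates, Equation (1.40)
    residual = y - S @ coef
    return float(np.real(np.vdot(residual, residual)))

rng = np.random.default_rng(0)
M = 8
alpha = np.sqrt(1000.0) * np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, 2))  # 30 dB each (assumed)
noise = (rng.standard_normal(M) + 1j * rng.standard_normal(M)) / np.sqrt(2.0)
y = steering([-5.0, 10.0], M) @ alpha + noise      # single snapshot

grid = np.arange(-30.0, 30.5, 0.5)
# The cost is symmetric in its arguments, so only pairs with th0 < th1 are searched.
cost_min, theta0_hat, theta1_hat = min(
    (ml_cost(y, t0, t1), t0, t1) for i, t0 in enumerate(grid) for t1 in grid[i + 1:]
)
print(f"grid estimate: ({theta0_hat:.1f}, {theta1_hat:.1f}) deg")
```

Searching only half of the grid exploits the invariance of the cost to the ordering of the emitters; the number of cost evaluations still grows with the square of the grid size, which is the computational burden the text refers to.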

Figure 1.11 depicts the maximum likelihood cost function $d(\theta_0, \theta_1)$ for the
case of two emitters: one at direction $\theta_0 = -5^\circ$ and the other at $\theta_1 = 10^\circ$. The two
signals have equal power. An 8-element uniformly spaced array was used with
half-wavelength element spacing. The cost function has two minima, at $(-5^\circ, 10^\circ)$
and $(10^\circ, -5^\circ)$, which is to be expected because the cost function is invariant to the
ordering of the emitters. This invariance is expressed by the symmetry relationship
$d(\theta_0,\theta_1) = d(\theta_1,\theta_0)$.

FIGURE 1.11 Maximum likelihood cost function (Equation 1.42) for an 8-element uniformly
spaced array and two emitters at $\theta_0 = -5^\circ$ and $\theta_1 = 10^\circ$.


We considered estimating directions from a single snapshot of received data. 
If multiple snapshots are available, we need to distinguish between the case 
where the signals are correlated from snapshot to snapshot and the case where 
they are uncorrelated. In the uncorrelated case we have 

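$$\mathbf{y}[k] = \alpha_0[k]\,\mathbf{a}(\theta_0) + \alpha_1[k]\,\mathbf{a}(\theta_1) + \mathbf{v}[k], \quad k = 1, \ldots, K \tag{1.43}$$

where the scale factors $\alpha_0[k], \alpha_1[k]$ vary randomly from snapshot to snapshot.
It is straightforward to show that in this case the maximum likelihood estimator
requires minimization of the cost function

$$d(\theta_0, \theta_1) = \sum_{k=1}^{K} \mathbf{y}^H[k]\,\mathbf{P}^{\perp}_{S(\theta_0,\theta_1)}\,\mathbf{y}[k] \tag{1.44}$$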


In other words, the cost function for the case of multiple snapshots is the sum of 
the cost functions for the individual snapshots. 

In the fully correlated case the scale factors $\alpha_0[k], \alpha_1[k]$ are fixed throughout
the $K$ snapshots so that the signal model becomes

$$\mathbf{y}[k] = \alpha_0 \mathbf{a}(\theta_0) + \alpha_1 \mathbf{a}(\theta_1) + \mathbf{v}[k] \tag{1.45}$$

In this case the maximum likelihood estimator requires minimization of the cost
function

$$d(\theta_0, \theta_1) = \bar{\mathbf{y}}^H \mathbf{P}^{\perp}_{S(\theta_0,\theta_1)}\,\bar{\mathbf{y}} \tag{1.46}$$

where

$$\bar{\mathbf{y}} = \frac{1}{K} \sum_{k=1}^{K} \mathbf{y}[k] \tag{1.47}$$

is the average of the received data vectors. In other words, the cost function 
for the case of multiple snapshots is the same as the cost function for a single 
snapshot, with the data vector replaced by the average vector. 
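The two multi-snapshot cost functions differ only in where the averaging happens, which a short sketch makes concrete. The helper names, the amplitudes, and the choice of K = 20 are illustrative assumptions; the projector is formed with a pseudo-inverse for brevity.

```python
import numpy as np

def steering(theta_deg, M=8):
    """ULA manifold, half-wavelength spacing (assumed); returns an M x N matrix."""
    theta = np.deg2rad(np.atleast_1d(theta_deg).astype(float))
    m = np.arange(M)[:, None]
    return np.exp(1j * np.pi * m * np.sin(theta)[None, :])

def perp_projector(th0, th1, M=8):
    """Projection onto the orthogonal complement of span{a(th0), a(th1)}."""
    S = steering([th0, th1], M)
    return np.eye(M) - S @ np.linalg.pinv(S)

def cost_uncorrelated(Y, th0, th1):
    """Sum over snapshots of y[k]^H P_perp y[k], as in Equation (1.44)."""
    P = perp_projector(th0, th1, Y.shape[0])
    return float(np.real(np.einsum("mk,mn,nk->", Y.conj(), P, Y)))

def cost_correlated(Y, th0, th1):
    """ybar^H P_perp ybar with ybar the snapshot average, as in Equations (1.46)-(1.47)."""
    P = perp_projector(th0, th1, Y.shape[0])
    ybar = Y.mean(axis=1)
    return float(np.real(ybar.conj() @ P @ ybar))

rng = np.random.default_rng(1)
M, K = 8, 20
noise = (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2.0)
Y = steering([-5.0, 10.0], M) @ np.array([[3.0], [2.0 + 1.0j]]) + noise  # fixed amplitudes
print(cost_uncorrelated(Y, -5.0, 10.0), cost_correlated(Y, -5.0, 10.0))
```

Averaging the snapshots first reduces the noise in the data vector before the projection, which is why the correlated-case cost is much smaller at the true directions when the amplitudes really are fixed.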


1.6.2 Other Direction-Finding Methods 

While the maximum likelihood estimator provides an optimal solution to the
multi-emitter direction-finding problem, it is rarely, if ever, used in practice
because of its computational complexity. Much research has been devoted
to the development of alternative suboptimal but computationally feasible
methods, most of which are variations on the so-called Multiple Signal
Classification (MUSIC) algorithm (Bienvenu and Kopp, 1983; Schmidt, 1979, 1981;
Weiss and Friedlander, 1994). It is not our intention to provide a comprehensive
survey of these techniques, which would take a separate chapter. Instead, we
present a brief summary of the basic MUSIC algorithm and compare it to the MLE.
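Consider the case in which $P$ signals are impinging on the array:

$$\mathbf{y}[k] = \sum_{p=1}^{P} \alpha_p[k]\,\mathbf{a}(\theta_p) + \mathbf{v}[k], \quad k = 1, \ldots, K \tag{1.48}$$

where $\theta_p$ are the signal directions, $\alpha_p[k]$ are the complex amplitudes of the signals,
assumed to be uncorrelated with one another and from snapshot to snapshot, and
$\mathbf{v}[k]$ is a vector of zero-mean unit-variance Gaussian noise. The covariance of the
received signal vector $\mathbf{R}_y = E\{\mathbf{y}[k]\mathbf{y}[k]^H\}$ is given by

$$\mathbf{R}_y = \sum_{p=1}^{P} \sigma_p^2\,\mathbf{a}(\theta_p)\mathbf{a}^H(\theta_p) + \mathbf{I} \tag{1.49}$$

where $\sigma_p^2 = E\{|\alpha_p[k]|^2\}$ is the SNR of the $p$th signal. Let

$$\mathbf{R}_y = \mathbf{U}\boldsymbol{\Sigma}\mathbf{U}^H \tag{1.50}$$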

be the singular-value decomposition of the covariance matrix, where

$$\boldsymbol{\Sigma} = \mathrm{diag}\{[\sigma_1^2 + 1, \ldots, \sigma_P^2 + 1, 1, \ldots, 1]\} \tag{1.51}$$

is a diagonal matrix with elements that are the singular values of $\mathbf{R}_y$. The matrix
of singular vectors $\mathbf{U}$ can be partitioned as

$$\mathbf{U} = [\mathbf{U}_s, \mathbf{U}_n] \tag{1.52}$$

where the $M \times P$ matrix $\mathbf{U}_s$ contains the singular vectors corresponding to the $P$
largest singular values, and the matrix $\mathbf{U}_n$ contains the singular vectors
corresponding to the $M - P$ smallest singular values. The range space of $\mathbf{U}_s$ equals the
space spanned by the signal vectors,

$$\mathcal{R}\{\mathbf{U}_s\} = \mathcal{R}\{[\mathbf{a}(\theta_1), \ldots, \mathbf{a}(\theta_P)]\} \tag{1.53}$$

so we refer to $\mathbf{U}_s$ as the “signal subspace.” Similarly, $\mathbf{U}_n$ is the “noise subspace.”
Because $\mathbf{U}$ is a unitary matrix the signal and noise subspaces are orthogonal,
so that

$$\mathbf{U}_s^H \mathbf{U}_n = \mathbf{0} \tag{1.54}$$

It follows that

$$\mathbf{a}^H(\theta)\,\mathbf{U}_n = \mathbf{0} \quad \text{for } \theta = \theta_1, \ldots, \theta_P \tag{1.55}$$


For this reason we compute

$$S(\theta) = \frac{1}{|\mathbf{a}^H(\theta)\,\mathbf{U}_n|^2} \tag{1.56}$$

which will have peaks (infinitely high in the noise-free case) at the signal
directions.

In practice, the covariance matrix $\mathbf{R}_y$ is not known. However, it can be
estimated from the available data by forming the sample covariance matrix,

$$\hat{\mathbf{R}}_y = \frac{1}{K} \sum_{k=1}^{K} \mathbf{y}[k]\,\mathbf{y}^H[k] \tag{1.57}$$

Computing the singular-value decomposition of the estimated covariance matrix,
$\hat{\mathbf{R}}_y = \hat{\mathbf{U}}\hat{\boldsymbol{\Sigma}}\hat{\mathbf{U}}^H$, and partitioning $\hat{\mathbf{U}}$ as
$\hat{\mathbf{U}} = [\hat{\mathbf{U}}_s, \hat{\mathbf{U}}_n]$ yields the estimated noise subspace, which is then used to
compute the so-called “MUSIC spectrum”:

$$\hat{S}(\theta) = \frac{1}{|\mathbf{a}^H(\theta)\,\hat{\mathbf{U}}_n|^2} \tag{1.58}$$


Figure 1.12 depicts the MUSIC spectrum for an 8-element uniformly spaced
linear array with half-wavelength spacing, signal directions $\theta_0 = 5^\circ$, $\theta_1 = 10^\circ$, and
$\mathrm{SNR}_0 = \mathrm{SNR}_1 = 10$ dB, $K = 100$. Note the sharp peaks at the emitter directions.
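The steps from the sample covariance to the MUSIC spectrum can be sketched as follows. This is a minimal illustration under assumptions: an eigendecomposition (`numpy.linalg.eigh`) stands in for the singular-value decomposition, which is equivalent for a Hermitian positive semidefinite matrix; the amplitudes are drawn as complex Gaussian; and the helper names and random seed are arbitrary.

```python
import numpy as np

def steering(theta_deg, M=8):
    """ULA manifold, half-wavelength spacing (assumed); returns an M x N matrix."""
    theta = np.deg2rad(np.atleast_1d(theta_deg).astype(float))
    m = np.arange(M)[:, None]
    return np.exp(1j * np.pi * m * np.sin(theta)[None, :])

def simulate_snapshots(thetas_deg, snr_db, K, M, rng):
    """Snapshots with random complex Gaussian amplitudes and unit-variance noise."""
    A = steering(thetas_deg, M)
    P = A.shape[1]
    scale = 10.0 ** (snr_db / 20.0)
    alpha = scale * (rng.standard_normal((P, K)) + 1j * rng.standard_normal((P, K))) / np.sqrt(2.0)
    noise = (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2.0)
    return A @ alpha + noise

def music_spectrum(Y, P, scan_deg):
    """1 / |a(theta)^H U_n|^2 with U_n estimated from the sample covariance."""
    M, K = Y.shape
    R = Y @ Y.conj().T / K                      # sample covariance, Equation (1.57)
    _, eigvec = np.linalg.eigh(R)               # eigenvalues in ascending order
    Un = eigvec[:, : M - P]                     # M - P smallest: estimated noise subspace
    return 1.0 / np.sum(np.abs(Un.conj().T @ steering(scan_deg, M)) ** 2, axis=0)

def top_peaks(spec, scan_deg, P):
    """Directions of the P largest local maxima of the spectrum, sorted by angle."""
    inner = np.arange(1, len(spec) - 1)
    is_peak = (spec[inner] > spec[inner - 1]) & (spec[inner] >= spec[inner + 1])
    idx = inner[is_peak]
    idx = idx[np.argsort(spec[idx])[-P:]]
    return np.sort(scan_deg[idx])

rng = np.random.default_rng(2)
scan = np.linspace(-30.0, 30.0, 6001)
Y = simulate_snapshots([5.0, 10.0], snr_db=10.0, K=100, M=8, rng=rng)
spectrum = music_spectrum(Y, 2, scan)
print("estimated directions (deg):", top_peaks(spectrum, scan, 2))
```

Only a one-dimensional scan over the angle is needed, regardless of the number of signals.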

The main advantage of MUSIC is computational: It requires only a
one-dimensional search, whereas the MLE requires a $P$-dimensional search. Certain
variations of MUSIC, such as Root-MUSIC, eliminate the search altogether,
replacing it with root solving (Friedlander, 1993; Weiss and Friedlander, 1993).

FIGURE 1.12 MUSIC spectrum for an 8-element uniformly spaced linear array with emitters at
directions $\theta_0 = 5^\circ$ and $\theta_1 = 10^\circ$, with $\mathrm{SNR}_0 = \mathrm{SNR}_1 = 10$ dB, using $K = 100$ snapshots.


MUSIC is a suboptimal algorithm and generally is not statistically efficient.
However, as the SNR of all the signals approaches infinity, the error variance of
MUSIC approaches the CRB (Friedlander, 1990; Porat and Friedlander, 1988).
Furthermore, in most scenarios of practical interest its accuracy is close to
optimal.

The MUSIC algorithm and all of its many variations are based on the
estimated covariance of the received data. To form a reliable estimate of $\mathbf{R}_y$, it is
necessary to collect a sufficiently large number of snapshots; a simple rule of
thumb is, at the very least, $K > 3M$. However, to separate closely spaced signals a
much larger number of snapshots may be required, typically tens or hundreds.
This is in contrast to the MLE, which can operate, if necessary, with a single
snapshot.

The requirement of a large number of snapshots is generally disadvantageous.
It slows down the response time of the system and limits the number of emitters
that can be monitored given finite system resources. More important, the system
may fail to estimate the directions of short-lived signals because collecting a
sufficient number of snapshots is not possible.


1.6.3 Accuracy 

The maximum likelihood estimator is known to be asymptotically efficient
(Cramér, 1951; Fisher, 1922; Rao, 1965). Therefore, at a high SNR its
performance approaches the CRB. The CRB for the multiple emitter case is summarized
in Section A.1.3.

Figure 1.13 depicts the standard deviation calculated using the CRB for the
case of two emitters at directions $\theta_0 = \Delta/2$ and $\theta_1 = -\Delta/2$, where $\Delta$ is the angular
separation. An 8-element uniformly spaced array was used with half-wavelength
element spacing and signal-to-noise ratios of $\mathrm{SNR}_0 = \mathrm{SNR}_1 = 30$ dB. The figure
demonstrates the fact that the error variance increases sharply when emitter
separation is smaller than a beamwidth (the 3-dB beamwidth is 13° in this case).
When the separation exceeds the beamwidth, the error variance is essentially
independent of it and is equal to the variance of the single-emitter case.
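The qualitative behavior described above can be checked with a small sketch of a Cramér-Rao bound computation. The form used here is the standard deterministic (conditional) bound for a single snapshot with unknown complex amplitudes and unit-variance noise; it is an assumption that this matches the form summarized in Section A.1.3. The amplitudes are taken to be real and positive at the first array element, which fixes the relative phase of the two signals, and the function name is hypothetical.

```python
import numpy as np

def crb_std_deg(thetas_deg, snrs_db, M=8):
    """Deterministic single-snapshot CRB on each direction, returned as a std in degrees.

    Assumes a half-wavelength ULA, unit-variance noise, and zero-phase amplitudes.
    """
    th = np.deg2rad(np.atleast_1d(thetas_deg).astype(float))
    m = np.arange(M)[:, None]
    A = np.exp(1j * np.pi * m * np.sin(th))             # manifold vectors
    D = 1j * np.pi * m * np.cos(th) * A                 # derivatives d a / d theta
    P_perp = np.eye(M) - A @ np.linalg.pinv(A)          # projector onto complement of span(A)
    alpha = 10.0 ** (np.atleast_1d(snrs_db).astype(float) / 20.0)
    H = D.conj().T @ P_perp @ D
    fisher = 2.0 * np.real(np.conj(alpha)[:, None] * H * alpha[None, :])
    return np.rad2deg(np.sqrt(np.diag(np.linalg.inv(fisher))))

for separation in (2.0, 6.5, 13.0, 26.0):
    std = crb_std_deg([separation / 2.0, -separation / 2.0], [30.0, 30.0])
    print(f"separation {separation:5.1f} deg -> CRB std {std[0]:.4f} deg")
```

For a single emitter at broadside this expression reduces to the closed form $6 / (\mathrm{SNR}\, M (M^2 - 1)\, \pi^2)$ for the variance in radians squared, which gives a convenient sanity check.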

Figure 1.14 compares the standard deviation of the maximum likelihood
estimator, obtained via Monte Carlo simulation, to the CRB for two equal-power
emitters at directions $\theta_0 = 3^\circ$ and $\theta_1 = 10^\circ$. As expected, the standard deviation
of the MLE approaches the bound at a high SNR (Friedlander, 1989) but departs
from it below a threshold SNR, which is around 15 dB in this case. We
note that the threshold SNR increases as the emitter separation decreases (not
shown here). Thus, reliable direction finding for closely spaced signals is only
feasible when the SNR is sufficiently high (Weinstein and Weiss, 1988; Weiss
and Weinstein, 1985). This important point will be discussed in more detail in
the following section.

FIGURE 1.13 CRB as a function of emitter separation for two equal-power signals that use an
8-element uniformly spaced array.

FIGURE 1.14 Experimental and theoretical standard deviation of direction estimates for two
equal-power emitters at $\theta_0 = 3^\circ$ and $\theta_1 = 10^\circ$, using an 8-element uniformly spaced array.


1.6.4 Resolution 


Estimation accuracy is not sufficient to characterize the performance of the
direction-finding system when multiple co-channel signals are present. We must
also consider the issue of resolution: the ability of the system to distinguish
between tightly spaced emitters (Helstrom, 1955).

Resolution is closely related to determination of the number of distinct
components in the composite signal received by the array. Whereas accuracy is related
to estimation, resolution is related to detection or decision (Kay, 1998; Van Trees,
1968). To estimate emitter directions it is necessary to determine the number
of directions to be estimated. Applying an MLE designed for $P$ signals to the
received data will produce $P$ direction estimates regardless of how many signals
are present. Thus, a separate detection/decision step is needed to determine how
many directions can be estimated meaningfully.

Knowing the number of emitters present is not sufficient. Assume, for
example, that we know that two signals are present: one strong (high SNR) and the
other weak (low SNR). Given this knowledge we use the MLE introduced earlier
to estimate two directions. The MLE will produce two estimates regardless of
how low the SNR of the weak signal may be. It is evident that one of them will
not be meaningful if the SNR of the weak signal is too low. Thus, rather than how
many signals are present, a better question would be how many signal directions
can be reliably estimated given the available data. This is a subtle but important
distinction: The first question presumes a unique “true” answer independent of
system parameters; the second has different answers depending on the SNR, the
separation between signals, and possibly some other factors.

This question can be formulated more precisely as a hypothesis-testing or 
detection problem. For simplicity we consider only the case of two signals; the 
same idea can be extended to more signals, although the details become more 
complicated. Consider the following binary hypothesis-testing problem: 

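$$H_0: \ \mathbf{y} = \alpha_0 \mathbf{a}(\theta_0) + \mathbf{v} \tag{1.59}$$

$$H_1: \ \mathbf{y} = \alpha_1 \mathbf{a}(\theta_1) + \alpha_2 \mathbf{a}(\theta_2) + \mathbf{v} \tag{1.60}$$

Given the data $\mathbf{y}$ we decide whether it was generated by a single signal with
unknown direction $\theta_0$ and unknown amplitude $\alpha_0$, or by two signals with
unknown directions $\theta_1, \theta_2$ and unknown amplitudes $\alpha_1, \alpha_2$.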

Different detection techniques can be used to solve this problem. The
generalized likelihood ratio (GLR) detector provides one solution (see Section A.1.10
and Friedlander, 2009). Recently a Bayesian detector was used to obtain practical
closed-form results (Amar and Weiss, 2007, 2008). Given a particular scenario
(directions, amplitudes, or signal-to-noise ratios), the detector can correctly
determine which of the two hypotheses is true with some probability $P_c$. We
will say that the two signals are resolved if the probability of correct detection
equals or exceeds some nominal value, say $P_c \ge 0.8$. The resolution limit can
now be defined as the angular separation $\Delta_r$ between the signals for which $P_c$
equals its nominal value.

A number of other attempts have been made to define the resolution limit.
One is based on estimation accuracy. Assume two emitters at directions $\theta_0$
and $\theta_1$, where the directions are estimated with standard deviations $\sigma_0$ and $\sigma_1$,
respectively. These standard deviations are sometimes approximated by the
corresponding values of the CRB. According to Lee and Wengrovitz (1990), the
signals are resolvable if the larger of $\sigma_0, \sigma_1$ is smaller than half of the
separation. Although this approach provides a simple expression for the resolution
limit, it ignores the correlation between the direction estimates, which increases
as the separation decreases. To overcome this difficulty, Smith (2005) proposed
estimating the angular separation (rather than the individual directions) and
evaluating the standard deviation of this estimate using the CRB. The signals are said
to be resolvable if this standard deviation is smaller than the separation.

A different approach to determining resolution is to consider a cost function
that exhibits two peaks at approximately the true directions (Kaveh and Barabell,
1986; Marple, 1977). When the separation decreases the two peaks merge into
one. The resolution limit is defined as the separation at which the expected value
of the cost function at the midpoint between the two peaks equals the expected
values of the peaks.

Yet another approach is based on the performance of the GLR test for
determining the number of signals. This method has been used to determine the
resolution of two radar signals (Root, 1962).

A review of this literature shows that the resolution limit of the MUSIC
algorithm normalized by the beamwidth is inversely proportional to the fourth
root of the signal-to-noise ratio (for the case of two equal-power signals):

$$\frac{\Delta_r}{BW} \propto \frac{1}{\mathrm{SNR}^{1/4}} \tag{1.61}$$

This relationship indicates that super-resolution can only be achieved at high
signal-to-noise ratios. For instance, resolving signals half a beamwidth apart
requires a 12-dB higher SNR (a factor of $2^4 = 16$) than is needed to resolve signals
a full beamwidth apart. Similarly, a resolution of one-tenth of a beamwidth requires
a 40-dB higher SNR than is needed for a one-beamwidth resolution.


Amar and Weiss (2007) showed that a lower bound on the resolution limit
normalized by the beamwidth is inversely proportional to the square root (rather
than the fourth root) of the signal-to-noise ratio. A similar relationship of
resolution and SNR is observed when using the model order selection criteria known
as the Akaike information criterion (AIC) and the minimum description length
(MDL) to determine the number of signals (Wax and Kailath, 1985). However,
it appears that under most circumstances the practically achievable resolution
limit is inversely proportional to the fourth root of the SNR.
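The SNR cost implied by these scaling laws is easy to tabulate. The sketch below assumes only the proportionality itself, $\Delta_r / BW \propto \mathrm{SNR}^{-1/r}$ with $r = 4$ for the fourth-root law and $r = 2$ for the square-root law; the function name and the `root` parameter are hypothetical.

```python
import math

def snr_penalty_db(fraction_of_beamwidth, root=4):
    """Extra SNR in dB needed to resolve at a fraction of a beamwidth,
    relative to one-beamwidth resolution, if resolution scales as SNR**(-1/root)."""
    return 10.0 * root * math.log10(1.0 / fraction_of_beamwidth)

for fraction in (0.5, 0.25, 0.1):
    print(f"{fraction:4.2f} BW: fourth-root law {snr_penalty_db(fraction):5.1f} dB, "
          f"square-root law {snr_penalty_db(fraction, root=2):5.1f} dB")
```

Under the fourth-root law each halving of the resolution limit costs about 12 dB, whereas under the square-root law it costs about 6 dB.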


1.7 DISCUSSION 

Classical methods of direction finding using antenna arrays are optimal when the
received signal has a single component, that is, a single emitter and no multipath.
In fact, the beamformer can be interpreted as the generalized maximum likelihood
direction estimator, as was shown earlier. In the case where the received signal
has multiple components whose directions are well separated (more than
a beamwidth apart) the classical methods are capable of resolving the signal
components and optimally estimating their directions.

When using arrays with relatively large numbers of antennas and large
apertures, the correspondingly small beamwidth generally provides sufficient
resolving power to meet system requirements. Also, the probability of
multiple co-channel emitters falling within a given beam decreases as the size of
the beamwidth decreases, making it unlikely that more than one signal will be
present at any given time. In such situations only classical methods are needed.

The more challenging situation is when arrays with a small number of
elements and a small aperture must be used either because of cost considerations
or because of physical constraints such as limited space to place the antennas
in mobile and airborne applications. In this case, the beamwidth and hence the
Rayleigh resolution are relatively large. Also, the larger the beamwidth, the
higher the probability of having multiple co-channel emitters fall within a beam.

As discussed earlier, the presence of multiple unresolved signal components
introduces bias errors that limit the accuracy of the direction-finding system to
some significant fraction of a beamwidth. These errors are a function of the
angular distribution and relative powers of the co-channel signal components
and are present even when the SNR is very high. It is desirable to reduce or
eliminate them.¹

High accuracy and high resolution with small aperture arrays and a
small number of antennas are the “holy grails” of direction finding.
Achieving them has motivated extensive research on direction-finding methods
that produce fraction-of-a-beamwidth resolution, or super-resolution. Since
the 1980s many such algorithms have been developed and studied in the
literature (Bienvenu and Kopp, 1983; Schmidt, 1979, 1981). Their
performance characteristics are now fairly well understood. In addition to
analysis and simulation work, substantial experimental work has been done in
this area.

In spite of extensive research and development over a substantial length of
time and the demonstrated superiority of advanced super-resolution
direction-finding methods, it appears that their practical use is limited. In the rest of this
section we attempt to identify the possible reasons for this situation.

Consider the following characteristics of super-resolution techniques: 

Relatively high computational complexity. Covariance-based methods such as
MUSIC require significantly more computations than classical methods. The
computational requirements of the MLE and its variations increase exponentially
in the number of simultaneously estimated directions.

Accurate calibration. Required accuracy increases with desired resolution (Porat
and Friedlander, 1997; Weiss and Friedlander, 1989a,b). Classical methods
also require calibration but are more robust to errors.

Relatively high costs. These are in part due to the two issues just mentioned.

Multiple snapshots. Covariance-based techniques such as MUSIC require a
relatively large number of snapshots to produce direction estimates. Classical
methods are able to operate with a single snapshot but can fully benefit from
multiple snapshots when these are available.

High SNR. The SNR required by MUSIC and its variations is inversely
proportional to the fourth power of the desired resolution limit.

The first three listed items are not fundamental limitations, and they are
amenable to technological solutions. The ever increasing power and decreasing
cost of processors makes it possible to implement increasingly more
sophisticated and computationally demanding algorithms over time. The requirement
of multiple snapshots is more problematic but can, in principle, be eliminated
with use of the MLE, at least for a (very) small number of signal components.
However, the requirement for a high SNR is fundamental and cannot be “solved.”

1. The inability to resolve signals is, of course, a serious problem when attempting to classify or
copy them. However, here we focus only on the direction-finding aspect of the problem.

To better illustrate the SNR issue, we consider the following example.
Figure 1.15 depicts the mean-square error of the beamformer for two emitters:
one at direction $\theta_0 = -1^\circ$ and the other at $\theta_1 = 1^\circ$. The two signals have equal
power. An 8-element uniformly spaced array was used with half-wavelength
element spacing, and $K = 100$ snapshots were used to produce each estimate.
The beamwidth of this array is approximately 13°, so the emitters are about
15% of a beamwidth apart. The beamformer cannot resolve the two emitters
and instead estimates the midpoint between the two directions. The threshold
SNR of the beamformer is seen to be approximately $\mathrm{SNR}_{bf} = -15$ dB. This is
consistent with the earlier example in Figures 1.3 and 1.6, where the threshold
was observed around 5 dB using a single snapshot. $K = 100$ snapshots provides
an additional 20-dB processing gain, which reduces the threshold to −15 dB.

FIGURE 1.15 MSE of the direction estimate produced by a beamformer for two equal-power
emitters at $\theta_0 = -1^\circ$ and $\theta_1 = 1^\circ$, $K = 100$ snapshots, and an 8-element uniformly spaced linear
array. The SNR threshold is around −15 dB.

FIGURE 1.16 Resolution probability for two signals using the MUSIC algorithm for two
equal-power emitters at $\theta_0 = -1^\circ$ and $\theta_1 = 1^\circ$, $K = 100$ snapshots, and an 8-element uniformly spaced
linear array.

Next we apply the MUSIC algorithm to the same data set. MUSIC resolves the
two signals with some probability $P_r$. Figure 1.16 depicts resolution probability
as a function of SNR. We define (somewhat arbitrarily) the threshold SNR of
the MUSIC algorithm to be that corresponding to $P_r = 0.8$, which is seen to be
at $\mathrm{SNR}_{music} = 20$ dB. In this case we have a large difference between the two
thresholds: $\mathrm{SNR}_{music} - \mathrm{SNR}_{bf} = 35$ dB. Various techniques have been proposed
to reduce the SNR threshold of MUSIC (Lee and Wengrovitz, 1990), so that the
actual difference may be smaller than just indicated. However, the point here
is that there is a very substantial difference between the two threshold SNRs in
general.
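The threshold arithmetic in this example can be written out explicitly. The sketch assumes that the processing gain of $K$ snapshots is $10 \log_{10} K$ dB, which is consistent with the 20 dB quoted for $K = 100$; the threshold values are the ones read from the figures in the text, and the variable names are illustrative.

```python
import math

def processing_gain_db(num_snapshots):
    """Gain from combining K snapshots, assumed to be 10*log10(K) dB."""
    return 10.0 * math.log10(num_snapshots)

single_snapshot_threshold_db = 5.0    # beamformer threshold quoted for one snapshot
music_threshold_db = 20.0             # MUSIC threshold at P_r = 0.8, K = 100

beamformer_threshold_db = single_snapshot_threshold_db - processing_gain_db(100)
gap_db = music_threshold_db - beamformer_threshold_db
print(f"beamformer threshold {beamformer_threshold_db:.0f} dB, gap to MUSIC {gap_db:.0f} dB")
```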

Thus, we have the following situation. If the SNR is lower than $\mathrm{SNR}_{bf}$, the
emitter directions cannot be reliably estimated by any method. If the SNR is
higher than $\mathrm{SNR}_{bf}$ but lower than $\mathrm{SNR}_{music}$, the emitters cannot be resolved but
their general direction (in this case the direction of the midpoint) can be reliably
estimated. If the SNR is higher than $\mathrm{SNR}_{music}$, the emitters are reliably resolved
by the MUSIC algorithm. It is only in the last case that MUSIC provides a
significant advantage over the beamformer. More generally, any super-resolution
technique will have some threshold SNR below which it cannot resolve the two
emitters.

We refer to emitters with an SNR higher than $\mathrm{SNR}_{bf}$ as estimable and emitters
with an SNR higher than that required by whatever super-resolution technique is
being used as resolvable. In any real-world scenario the direction-finding system
will receive signals of interest from some emitters that are resolvable, some that
are estimable but not resolvable, and some that are not estimable. (We ignore
the latter class of emitters for now.) An important question is what fraction
of the observed emitters are resolvable. This is the fraction of cases for which
the super-resolution methods are effective. The answer will depend, of course,
on the parameters of the direction-finding system and the specific surveillance
scenario, so it is not possible to answer this question in general. However, it is
likely that in many direction-finding applications this fraction will not be large.
The SNR of signals of interest received by the array will have some distribution,
which will often have more low and moderate SNR signals (corresponding to
longer ranges, weaker transmitters, emitters obscured by hills or buildings, and
the like) than high-SNR emitters (relatively nearby, high-power transmitters,
line-of-sight propagation). Generally it is desirable to estimate the directions of
all signals present, not just those with a high SNR. Thus, only a fraction of the
signals will be resolvable.

To better illustrate this point we simulated the following “toy” scenario: Five
emitters are randomly distributed in angle and their SNRs are randomly distributed
between −15 dB and 30 dB. An 8-element uniformly spaced array is used to
collect $K = 100$ snapshots. The MUSIC algorithm and the beamformer estimate
directions, and the number of signals resolved by each algorithm is recorded.
This experiment is repeated for $N = 1000$ randomly chosen scenarios (angles and
SNRs). The histogram of the number of resolved signals is depicted in Figure 1.17.
Figure 1.18 depicts the results of the same experiment except that in this case
only two emitters are present. Figure 1.19 shows the results of the experiment in
Figure 1.17 except that the SNRs are distributed between −15 dB and 15 dB.

FIGURE 1.17 Histogram of the number of signals resolved by MUSIC (a) and by a beamformer
(b) for a random scenario with five emitters using an 8-element uniformly spaced array with $K = 100$
snapshots, based on 1000 random experiments, −15 < SNR < 30.

FIGURE 1.18 Histogram of the number of signals resolved by MUSIC (a) and by a beamformer
(b) for a random scenario with two emitters using an 8-element uniformly spaced array with $K = 100$
snapshots, based on 1000 random experiments.

As is to be expected, MUSIC resolves more signals than the beamformer does. 
The beamformer will only resolve signals separated by more than a beamwidth, 
while MUSIC will resolve signals with much smaller separation. However, the 
difference is not as decisive as one may expect and it depends on the parameters 
of the scenario—for example, the number of emitters and the SNR distribution. 
Whether or not the difference is sufficient to justify using a super-resolution 
method with its associated costs is a subject to be determined by the system 
designer. 

We rarely, if ever, find in the literature an evaluation of the relative
performance of classical and advanced direction-finding methods in the context of
a direction-finding scenario with realistic mixtures of emitters with different
directions and powers. Results are usually presented for one or a few selected
scenarios, most frequently for two equal-power emitters. These results are
certainly important and instructive, but they may create unrealistic expectations
about the performance of super-resolution methods in real-world deployments.

FIGURE 1.19 Histogram of the number of signals resolved by MUSIC (a) and by a beamformer
(b) for a random scenario with five emitters using an 8-element uniformly spaced array with $K = 100$
snapshots, based on 1000 random experiments, −15 < SNR < 10.


APPENDIX 

A.1.1 The Array Manifold and Its Derivatives 

The array manifold $\mathbf{a}(\theta)$ plays a key role in the performance of direction-finding
systems. Here we present the form of the manifolds for linear and circular
arrays, and we calculate their first- and second-order derivatives, which are
needed to compute the accuracy of direction estimation achieved by different
algorithms.

For simplicity we assume that the antennas and all the emitters are in the
same plane, so that the array manifold can be characterized by azimuth only.
This can be extended in a straightforward manner to the three-dimensional case,
where the manifold is a function of both azimuth and elevation. We consider the
far-field narrowband model discussed earlier.


Linear Array 

Consider a linear array with M omnidirectional antenna elements, where element m is
located at distance $d_m$ from a common reference point in the array. Let $\theta$ be the
direction measured from the line perpendicular to the array. The array manifold is given by

$$\mathbf{a}(\theta) = \begin{bmatrix} \exp\{j2\pi d_1 \sin\theta/\lambda\} \\ \exp\{j2\pi d_2 \sin\theta/\lambda\} \\ \vdots \\ \exp\{j2\pi d_M \sin\theta/\lambda\} \end{bmatrix} \tag{1.62}$$

Note that $|\mathbf{a}(\theta)|^2 = M$. If the antenna elements are directional and have identical
patterns $g(\theta)$, the corresponding array manifold is obtained by multiplying the
elements of $\mathbf{a}(\theta)$ in Equation (1.62) by $g(\theta)$.

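The first derivative of the array manifold is given by

$$\dot{\mathbf{a}}(\theta) = \frac{d}{d\theta}\mathbf{a}(\theta) = j(2\pi/\lambda)\cos\theta\,\mathbf{D}\,\mathbf{a}(\theta) \tag{1.63}$$

where

$$\mathbf{D} = \mathrm{diag}\{[d_1, d_2, \ldots, d_M]\} \tag{1.64}$$

is a diagonal matrix containing the element locations on the diagonal. Note that

$$\mathbf{a}(\theta)^H \dot{\mathbf{a}}(\theta) = j(2\pi/\lambda)\cos\theta \sum_{m=1}^{M} d_m \tag{1.65}$$

Without loss of generality the distances $d_m$ can be defined relative to the phase
center of the array, in which case $\sum_{m=1}^{M} d_m = 0$, and therefore

$$\mathbf{a}(\theta)^H \dot{\mathbf{a}}(\theta) = 0 \tag{1.66}$$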


Note also that

$$\mathrm{real}\{\mathbf{a}(\theta)^H \dot{\mathbf{a}}(\theta)\} = 0 \tag{1.67}$$

even if the distances $d_m$ are not defined relative to the phase center of the array.
The norm of $\dot{\mathbf{a}}(\theta)$ is given by

$$|\dot{\mathbf{a}}(\theta)|^2 = \dot{\mathbf{a}}(\theta)^H \dot{\mathbf{a}}(\theta) = (2\pi/\lambda)^2 \cos^2\theta\;\overline{d^2} \tag{1.68}$$

where

$$\overline{d^2} = \sum_{m=1}^{M} d_m^2 \tag{1.69}$$

is the array’s “moment of inertia.” This variable plays a key role in the
performance of the direction-finding system.

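The second derivative of the array manifold, obtained by differentiating Equation (1.63), is given by

$$\ddot{\mathbf{a}}(\theta) = -(2\pi/\lambda)^2 \cos^2\theta\,\mathbf{D}^2\mathbf{a}(\theta) - j(2\pi/\lambda)\sin\theta\,\mathbf{D}\,\mathbf{a}(\theta) \tag{1.70}$$

It follows that

$$\mathbf{a}(\theta)^H \ddot{\mathbf{a}}(\theta) = -(2\pi/\lambda)^2 \cos^2\theta\;\overline{d^2} - j(2\pi/\lambda)\sin\theta \sum_{m=1}^{M} d_m \tag{1.71}$$

The second term is zero, assuming that $\sum_{m=1}^{M} d_m = 0$, so that

$$\mathbf{a}(\theta)^H \ddot{\mathbf{a}}(\theta) = -(2\pi/\lambda)^2 \cos^2\theta\;\overline{d^2} \tag{1.72}$$

Note that

$$\mathrm{real}\{\mathbf{a}(\theta)^H \ddot{\mathbf{a}}(\theta)\} = -(2\pi/\lambda)^2 \cos^2\theta\;\overline{d^2} \tag{1.73}$$

even if we do not assume that $\sum_{m=1}^{M} d_m = 0$.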
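
The linear-array relations above can be checked numerically. The following is a minimal sketch, not part of the original derivation: the 8-element half-wavelength geometry, the 20-degree direction, and all function names are illustrative assumptions. It builds the manifold of Equation (1.62), compares the analytic derivative of Equation (1.63) with a central finite difference, and evaluates Equations (1.66), (1.68), and (1.69).

```python
import cmath
import math


def linear_manifold(d, theta, lam):
    """Array manifold a(theta) of Equation (1.62) for element positions d."""
    k = 2 * math.pi / lam
    return [cmath.exp(1j * k * dm * math.sin(theta)) for dm in d]


def linear_manifold_dot(d, theta, lam):
    """Analytic first derivative of a(theta), Equation (1.63)."""
    k = 2 * math.pi / lam
    a = linear_manifold(d, theta, lam)
    return [1j * k * math.cos(theta) * dm * am for dm, am in zip(d, a)]


def inner(x, y):
    """Hermitian inner product x^H y."""
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))


# Hypothetical geometry: 8 elements, half-wavelength spacing,
# positions measured from the phase center so that sum(d) = 0.
lam = 1.0
d = [(m - 3.5) * 0.5 * lam for m in range(8)]
theta = math.radians(20.0)

a = linear_manifold(d, theta, lam)
a_dot = linear_manifold_dot(d, theta, lam)

# Central finite difference as an independent check of Equation (1.63).
h = 1e-6
a_plus = linear_manifold(d, theta + h, lam)
a_minus = linear_manifold(d, theta - h, lam)
a_dot_fd = [(p - q) / (2 * h) for p, q in zip(a_plus, a_minus)]
max_fd_error = max(abs(x - y) for x, y in zip(a_dot, a_dot_fd))

d2_bar = sum(dm ** 2 for dm in d)  # "moment of inertia", Equation (1.69)
norm_sq_formula = (2 * math.pi / lam) ** 2 * math.cos(theta) ** 2 * d2_bar  # (1.68)
norm_sq_direct = inner(a_dot, a_dot).real
a_H_a_dot = inner(a, a_dot)  # vanishes for a phase-centered array, (1.66)

print(max_fd_error, norm_sq_formula, norm_sq_direct, abs(a_H_a_dot))
```

Shifting every entry of `d` by a constant leaves `norm_sq_direct` unequal to the phase-centered value but keeps the real part of `a_H_a_dot` at zero, which is the content of Equation (1.67).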


Circular Array 

Consider a circular array with radius R and M antenna elements. For omnidirectional
elements the array manifold is given by

$$\mathbf{a}(\theta) = \begin{bmatrix} \exp\{j2\pi R \sin(\theta - \theta_1)/\lambda\} \\ \exp\{j2\pi R \sin(\theta - \theta_2)/\lambda\} \\ \vdots \\ \exp\{j2\pi R \sin(\theta - \theta_M)/\lambda\} \end{bmatrix} \tag{1.74}$$

where $\theta_m$ are the element angles relative to the center of the circle. Note that
$|\mathbf{a}(\theta)|^2 = M$.

For elements with identical directional patterns $g(\theta)$ oriented so that the
antenna boresight is along the radial line from the circle center to the antenna
location, we have

$$\mathbf{a}(\theta) = \begin{bmatrix} g(\theta - \theta_1)\exp\{j2\pi R \sin(\theta - \theta_1)/\lambda\} \\ g(\theta - \theta_2)\exp\{j2\pi R \sin(\theta - \theta_2)/\lambda\} \\ \vdots \\ g(\theta - \theta_M)\exp\{j2\pi R \sin(\theta - \theta_M)/\lambda\} \end{bmatrix} \tag{1.75}$$

In the following we consider only arrays with omnidirectional elements.

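The first derivative of the array manifold is given by

$$\dot{\mathbf{a}}(\theta) = \frac{d\mathbf{a}(\theta)}{d\theta} = j(2\pi R/\lambda)\,\mathbf{C}(\theta)\,\mathbf{a}(\theta) \tag{1.76}$$

where

$$\mathbf{C}(\theta) = \mathrm{diag}\{[\cos(\theta - \theta_1), \cos(\theta - \theta_2), \ldots, \cos(\theta - \theta_M)]\} \tag{1.77}$$

is a diagonal matrix. Note that

$$\mathbf{a}(\theta)^H \dot{\mathbf{a}}(\theta) = j(2\pi R/\lambda) \sum_{m=1}^{M} \cos(\theta - \theta_m) \tag{1.78}$$

A circular array with uniformly spaced elements has its phase center at the center
of the circle. In other words, if $\theta_m = 2\pi m/M + \theta_0$, then $\sum_{m=1}^{M} \cos(\theta - \theta_m) = 0$.
Therefore, $\mathbf{a}(\theta)^H \dot{\mathbf{a}}(\theta) = 0$. Also note that

$$\mathrm{real}\{\mathbf{a}(\theta)^H \dot{\mathbf{a}}(\theta)\} = 0 \tag{1.79}$$

for any circular array.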

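The norm of $\dot{\mathbf{a}}(\theta)$ is

$$|\dot{\mathbf{a}}(\theta)|^2 = \dot{\mathbf{a}}(\theta)^H \dot{\mathbf{a}}(\theta) = (2\pi R/\lambda)^2\,\overline{c^2}(\theta) \tag{1.80}$$

where

$$\overline{c^2}(\theta) = \sum_{m=1}^{M} \cos^2(\theta - \theta_m) \tag{1.81}$$

This variable plays a key role in the performance of the direction-finding system.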
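The second derivative of the array manifold, obtained by differentiating Equation (1.76), is given by

$$\ddot{\mathbf{a}}(\theta) = -(2\pi R/\lambda)^2\,\mathbf{C}^2(\theta)\,\mathbf{a}(\theta) - j(2\pi R/\lambda)\,\mathrm{diag}\{[\sin(\theta - \theta_1), \ldots, \sin(\theta - \theta_M)]\}\,\mathbf{a}(\theta) \tag{1.82}$$

and therefore

$$\mathbf{a}(\theta)^H \ddot{\mathbf{a}}(\theta) = -(2\pi R/\lambda)^2\,\overline{c^2}(\theta) - j(2\pi R/\lambda) \sum_{m=1}^{M} \sin(\theta - \theta_m) \tag{1.83}$$

For a uniformly spaced circular array, $\sum_{m=1}^{M} \sin(\theta - \theta_m) = 0$, so that

$$\mathbf{a}(\theta)^H \ddot{\mathbf{a}}(\theta) = -(2\pi R/\lambda)^2\,\overline{c^2}(\theta) \tag{1.84}$$

Note that

$$\mathrm{real}\{\mathbf{a}(\theta)^H \ddot{\mathbf{a}}(\theta)\} = -(2\pi R/\lambda)^2\,\overline{c^2}(\theta) \tag{1.85}$$

for any circular array, whether or not it is uniformly spaced.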
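
As a small illustration of Equations (1.78) and (1.81), the sketch below evaluates the two sums for a uniformly spaced circular array. The element count, the direction, and the radius are arbitrary assumptions. For uniform spacing with three or more elements, the sum of squared cosines equals M/2 regardless of direction (a standard trigonometric identity), so the norm in Equation (1.80) does not depend on the emitter direction.

```python
import math


def c2_bar(theta, element_angles):
    """Sum of cos^2(theta - theta_m), Equation (1.81)."""
    return sum(math.cos(theta - tm) ** 2 for tm in element_angles)


def cos_sum(theta, element_angles):
    """Sum of cos(theta - theta_m), which appears in Equation (1.78)."""
    return sum(math.cos(theta - tm) for tm in element_angles)


# Hypothetical uniformly spaced circular array with M = 8 elements.
M = 8
element_angles = [2 * math.pi * m / M for m in range(M)]
radius_over_lambda = 1.0  # assumed R / lambda

for theta_deg in (0.0, 17.0, 63.0):
    theta = math.radians(theta_deg)
    c2 = c2_bar(theta, element_angles)
    a_dot_norm_sq = (2 * math.pi * radius_over_lambda) ** 2 * c2  # Equation (1.80)
    print(theta_deg, c2, cos_sum(theta, element_angles), a_dot_norm_sq)

c2_at_37 = c2_bar(math.radians(37.0), element_angles)
cos_sum_at_37 = cos_sum(math.radians(37.0), element_angles)
```

For a nonuniform array the same functions apply unchanged, but `c2_bar` then varies with direction and `cos_sum` is generally nonzero.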

A.1.2 The Cramer-Rao Bound 

The Cramer-Rao bound (CRB) on the variance of direction estimation errors
provides a useful characterization of the achievable accuracy of DF systems. In
this section we derive the CRB for linear and circular arrays.

Let $\mathbf{y}$ be a multivariate complex Gaussian vector with zero mean and covariance
matrix $\mathbf{R}_y$. Assume that $\mathbf{R}_y$ depends on a single unknown parameter $\theta$. Then
the Fisher information matrix (Bangs, 1971) (a scalar in this case) is given by

$$F = \mathrm{trace}\left\{\mathbf{R}_y^{-1}\frac{\partial \mathbf{R}_y}{\partial\theta}\,\mathbf{R}_y^{-1}\frac{\partial \mathbf{R}_y}{\partial\theta}\right\} \tag{1.86}$$


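where

$$\mathbf{R}_y = \mathrm{SNR}\,\mathbf{a}(\theta)\mathbf{a}(\theta)^H + \mathbf{I} \tag{1.87}$$

Given K independent measurements $\mathbf{y}[1], \ldots, \mathbf{y}[K]$, the CRB is given by

$$\mathrm{CRB} = \frac{1}{K}F^{-1} \tag{1.88}$$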


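To calculate the CRB note that

$$\frac{\partial \mathbf{R}_y}{\partial\theta} = \mathrm{SNR}\,\dot{\mathbf{a}}(\theta)\mathbf{a}(\theta)^H + \mathrm{SNR}\,\mathbf{a}(\theta)\dot{\mathbf{a}}(\theta)^H \tag{1.89}$$

and therefore

$$F = \mathrm{SNR}^2\,\mathrm{trace}\left\{\left(\mathbf{R}_y^{-1}\dot{\mathbf{a}}(\theta)\mathbf{a}(\theta)^H + \mathbf{R}_y^{-1}\mathbf{a}(\theta)\dot{\mathbf{a}}(\theta)^H\right)\left(\mathbf{R}_y^{-1}\dot{\mathbf{a}}(\theta)\mathbf{a}(\theta)^H + \mathbf{R}_y^{-1}\mathbf{a}(\theta)\dot{\mathbf{a}}(\theta)^H\right)\right\} \tag{1.90}$$

or

$$F = \mathrm{SNR}^2\left[\left(\mathbf{a}(\theta)^H\mathbf{R}_y^{-1}\dot{\mathbf{a}}(\theta)\right)^2 + 2\left(\mathbf{a}(\theta)^H\mathbf{R}_y^{-1}\mathbf{a}(\theta)\right)\left(\dot{\mathbf{a}}(\theta)^H\mathbf{R}_y^{-1}\dot{\mathbf{a}}(\theta)\right) + \left(\dot{\mathbf{a}}(\theta)^H\mathbf{R}_y^{-1}\mathbf{a}(\theta)\right)^2\right] \tag{1.91}$$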

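The first and third terms are zero because

$$\mathbf{a}(\theta)^H\mathbf{R}_y^{-1}\dot{\mathbf{a}}(\theta) = \mathbf{a}(\theta)^H\dot{\mathbf{a}}(\theta) - \frac{\left(\mathbf{a}(\theta)^H\mathbf{a}(\theta)\right)\left(\mathbf{a}(\theta)^H\dot{\mathbf{a}}(\theta)\right)}{1/\mathrm{SNR} + \mathbf{a}(\theta)^H\mathbf{a}(\theta)} = 0 \tag{1.92}$$

since $\mathbf{a}(\theta)^H\dot{\mathbf{a}}(\theta) = 0$, as was shown earlier. Similarly,

$$\dot{\mathbf{a}}(\theta)^H\mathbf{R}_y^{-1}\mathbf{a}(\theta) = \left(\mathbf{a}(\theta)^H\mathbf{R}_y^{-1}\dot{\mathbf{a}}(\theta)\right)^H = 0 \tag{1.93}$$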


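Equation (1.92) follows from Equation (1.91) using the well-known matrix
identity

$$(A + BCD)^{-1} = A^{-1} - A^{-1}B\left(C^{-1} + DA^{-1}B\right)^{-1}DA^{-1} \tag{1.94}$$

from which it follows that

$$\mathbf{R}_y^{-1} = \mathbf{I} - \frac{\mathbf{a}(\theta)\mathbf{a}(\theta)^H}{1/\mathrm{SNR} + \mathbf{a}(\theta)^H\mathbf{a}(\theta)} \tag{1.95}$$

We conclude that

$$F = 2\,\mathrm{SNR}^2\left(\mathbf{a}(\theta)^H\mathbf{R}_y^{-1}\mathbf{a}(\theta)\right)\left(\dot{\mathbf{a}}(\theta)^H\mathbf{R}_y^{-1}\dot{\mathbf{a}}(\theta)\right) \tag{1.96}$$

Using Equation (1.95),

$$\mathbf{a}(\theta)^H\mathbf{R}_y^{-1}\mathbf{a}(\theta) = \mathbf{a}(\theta)^H\mathbf{a}(\theta) - \frac{\left(\mathbf{a}(\theta)^H\mathbf{a}(\theta)\right)\left(\mathbf{a}(\theta)^H\mathbf{a}(\theta)\right)}{1/\mathrm{SNR} + \mathbf{a}(\theta)^H\mathbf{a}(\theta)} \tag{1.97}$$

or

$$\mathbf{a}(\theta)^H\mathbf{R}_y^{-1}\mathbf{a}(\theta) = |\mathbf{a}(\theta)|^2 - \frac{|\mathbf{a}(\theta)|^4}{1/\mathrm{SNR} + |\mathbf{a}(\theta)|^2} = \frac{1}{\mathrm{SNR}}\,\frac{|\mathbf{a}(\theta)|^2}{1/\mathrm{SNR} + |\mathbf{a}(\theta)|^2} \tag{1.98}$$

and

$$\dot{\mathbf{a}}(\theta)^H\mathbf{R}_y^{-1}\dot{\mathbf{a}}(\theta) = \dot{\mathbf{a}}(\theta)^H\dot{\mathbf{a}}(\theta) - \frac{\left(\dot{\mathbf{a}}(\theta)^H\mathbf{a}(\theta)\right)\left(\mathbf{a}(\theta)^H\dot{\mathbf{a}}(\theta)\right)}{1/\mathrm{SNR} + \mathbf{a}(\theta)^H\mathbf{a}(\theta)} \tag{1.99}$$

The second term is zero, so

$$\dot{\mathbf{a}}(\theta)^H\mathbf{R}_y^{-1}\dot{\mathbf{a}}(\theta) = |\dot{\mathbf{a}}(\theta)|^2 \tag{1.100}$$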


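Finally,

$$F = 2\,\mathrm{SNR}\,|\dot{\mathbf{a}}(\theta)|^2\,\frac{|\mathbf{a}(\theta)|^2}{1/\mathrm{SNR} + |\mathbf{a}(\theta)|^2} \tag{1.101}$$

Note that for arrays with omnidirectional antennas, we have $|\mathbf{a}(\theta)|^2 = M$, and
therefore

$$\frac{|\mathbf{a}(\theta)|^2}{1/\mathrm{SNR} + |\mathbf{a}(\theta)|^2} = \frac{M\,\mathrm{SNR}}{M\,\mathrm{SNR} + 1} \tag{1.102}$$

If $M\,\mathrm{SNR} \gg 1$, then $M\,\mathrm{SNR}/(M\,\mathrm{SNR} + 1) \approx 1$, and therefore

$$F \approx 2\,\mathrm{SNR}\,|\dot{\mathbf{a}}(\theta)|^2 \tag{1.103}$$

The CRB is then given by

$$\mathrm{CRB} \approx \frac{1}{2K\,\mathrm{SNR}\,|\dot{\mathbf{a}}(\theta)|^2} \tag{1.104}$$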
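
To give a feel for the size of the approximation in Equation (1.103), the following sketch evaluates the exact Fisher information of Equation (1.101) and its high-SNR form for one example. The array geometry (8 phase-centered elements at half-wavelength spacing), the direction, the 10 dB SNR, and K = 100 are illustrative assumptions rather than values taken from the text, and the norm of the manifold derivative uses the linear-array expression of Equation (1.68).

```python
import math


def fisher_information(snr, a_norm_sq, a_dot_norm_sq):
    """Exact single-snapshot Fisher information, Equation (1.101)."""
    return 2.0 * snr * a_dot_norm_sq * a_norm_sq / (1.0 / snr + a_norm_sq)


def fisher_information_high_snr(snr, a_dot_norm_sq):
    """High-SNR approximation, Equation (1.103)."""
    return 2.0 * snr * a_dot_norm_sq


def crb(fisher, num_snapshots):
    """Bound on the direction error variance in rad^2, Equation (1.88)."""
    return 1.0 / (num_snapshots * fisher)


# Hypothetical scenario.
M = 8
lam = 1.0
d = [(m - (M - 1) / 2) * 0.5 * lam for m in range(M)]
theta = math.radians(20.0)
snr = 10.0 ** (10.0 / 10.0)  # 10 dB expressed as a linear ratio
K = 100

a_norm_sq = float(M)  # omnidirectional elements
a_dot_norm_sq = (2 * math.pi / lam) ** 2 * math.cos(theta) ** 2 * sum(x * x for x in d)

f_exact = fisher_information(snr, a_norm_sq, a_dot_norm_sq)
f_approx = fisher_information_high_snr(snr, a_dot_norm_sq)

crb_exact_deg = math.degrees(math.sqrt(crb(f_exact, K)))
crb_approx_deg = math.degrees(math.sqrt(crb(f_approx, K)))
print(f_exact / f_approx, crb_exact_deg, crb_approx_deg)
```

The ratio of the two Fisher information values is M SNR / (M SNR + 1), as in Equation (1.102), so the approximate bound is always slightly optimistic.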


In other words, the CRB is inversely proportional to the number of snapshots K,
the SNR, and the squared norm of $\dot{\mathbf{a}}(\theta)$. This norm was evaluated in the previous
section for linear and circular arrays. Thus, for a linear array we have, from (1.68),

$$\mathrm{CRB} \approx \frac{1}{2K\,\mathrm{SNR}\,(2\pi/\lambda)^2 \cos^2\theta\;\overline{d^2}} \tag{1.105}$$

and for a circular array, from (1.80),

$$\mathrm{CRB} \approx \frac{1}{2K\,\mathrm{SNR}\,(2\pi R/\lambda)^2\,\overline{c^2}(\theta)} \tag{1.106}$$

A.1.3 The Cramer-Rao Bound for Multiple Signals

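Let $\mathbf{y}$ be a multivariate complex Gaussian vector with zero mean and covariance
matrix $\mathbf{R}_y$. Assume that $\mathbf{R}_y$ depends on a vector of unknown parameters
$\theta_1, \ldots, \theta_P$. Then the Fisher information matrix (Fisher, 1922) is given by

$$F_{m,n} = \mathrm{trace}\left\{\mathbf{R}_y^{-1}\frac{\partial \mathbf{R}_y}{\partial\theta_m}\,\mathbf{R}_y^{-1}\frac{\partial \mathbf{R}_y}{\partial\theta_n}\right\} \tag{1.107}$$

where

$$\mathbf{R}_y = \sum_{p=1}^{P} \mathrm{SNR}_p\,\mathbf{a}(\theta_p)\mathbf{a}(\theta_p)^H + \mathbf{I} \tag{1.108}$$

Given K measurements $\mathbf{y}[1], \ldots, \mathbf{y}[K]$, the CRB is given by

$$\mathrm{CRB} = \frac{1}{K}\mathbf{F}^{-1} \tag{1.109}$$

where $\mathbf{F}$ is a $P \times P$ matrix whose $(m, n)$-th element is $F_{m,n}$. To calculate the
CRB note that

$$\frac{\partial \mathbf{R}_y}{\partial\theta_m} = \mathrm{SNR}_m\,\dot{\mathbf{a}}(\theta_m)\mathbf{a}(\theta_m)^H + \mathrm{SNR}_m\,\mathbf{a}(\theta_m)\dot{\mathbf{a}}(\theta_m)^H \tag{1.110}$$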

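The CRB just presented assumes that the SNRs of the different signals are
known. It is of interest to consider the case where these SNRs are unknown
and need to be jointly estimated with the directions $\theta_p$. In this case the Fisher
information matrix will consist of four $P \times P$ blocks:

$$\mathbf{F} = \begin{bmatrix} \mathbf{F}_{\theta,\theta} & \mathbf{F}_{\theta,\mathrm{SNR}} \\ \mathbf{F}_{\mathrm{SNR},\theta} & \mathbf{F}_{\mathrm{SNR},\mathrm{SNR}} \end{bmatrix} \tag{1.111}$$

where the elements of $\mathbf{F}_{\theta,\theta}$ are, as in (1.107),

$$[\mathbf{F}_{\theta,\theta}]_{m,n} = \mathrm{trace}\left\{\mathbf{R}_y^{-1}\frac{\partial \mathbf{R}_y}{\partial\theta_m}\,\mathbf{R}_y^{-1}\frac{\partial \mathbf{R}_y}{\partial\theta_n}\right\} \tag{1.112}$$

The elements of $\mathbf{F}_{\mathrm{SNR},\mathrm{SNR}}$ are

$$[\mathbf{F}_{\mathrm{SNR},\mathrm{SNR}}]_{m,n} = \mathrm{trace}\left\{\mathbf{R}_y^{-1}\frac{\partial \mathbf{R}_y}{\partial\mathrm{SNR}_m}\,\mathbf{R}_y^{-1}\frac{\partial \mathbf{R}_y}{\partial\mathrm{SNR}_n}\right\} \tag{1.113}$$

The elements of $\mathbf{F}_{\theta,\mathrm{SNR}}$ are

$$[\mathbf{F}_{\theta,\mathrm{SNR}}]_{m,n} = \mathrm{trace}\left\{\mathbf{R}_y^{-1}\frac{\partial \mathbf{R}_y}{\partial\theta_m}\,\mathbf{R}_y^{-1}\frac{\partial \mathbf{R}_y}{\partial\mathrm{SNR}_n}\right\} \tag{1.114}$$

and the elements of $\mathbf{F}_{\mathrm{SNR},\theta}$ are

$$[\mathbf{F}_{\mathrm{SNR},\theta}]_{m,n} = \mathrm{trace}\left\{\mathbf{R}_y^{-1}\frac{\partial \mathbf{R}_y}{\partial\mathrm{SNR}_m}\,\mathbf{R}_y^{-1}\frac{\partial \mathbf{R}_y}{\partial\theta_n}\right\} \tag{1.115}$$

To calculate these entries note that

$$\frac{\partial \mathbf{R}_y}{\partial\mathrm{SNR}_m} = \mathbf{a}(\theta_m)\mathbf{a}(\theta_m)^H \tag{1.116}$$

The derivative $\partial \mathbf{R}_y/\partial\theta_m$ is given in (1.110).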


A.1.4 Sensitivity Analysis 

Various methods have been developed for estimating the direction of an emitter 
from measurements of the array outputs. It is of interest to study the sensitivity of 
these estimates to measurement errors due to noise. In this section we derive some 
basic expressions for this sensitivity. In the following sections these expressions 
will be used to evaluate the estimation error variance for three specific
direction-finding methods.

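Assume that a parameter $\theta$ is estimated by maximizing (or minimizing) a
known function $f(\theta)$. In the absence of noise, $\theta = \theta_0$ is the value that maximizes
(or minimizes) that function. Therefore, the derivative of $f(\theta)$ is zero at $\theta_0$ (i.e.,
$\dot{f}(\theta_0) = 0$). Expanding $\dot{f}(\theta)$ in a first-order Taylor series, we obtain

$$\dot{f}(\theta) = \dot{f}(\theta_0) + \ddot{f}(\theta_0)(\theta - \theta_0) \tag{1.117}$$

Because $\dot{f}(\theta_0) = 0$,

$$\dot{f}(\theta) = \ddot{f}(\theta_0)(\theta - \theta_0) \tag{1.118}$$

In the absence of noise the value of $\theta$ for which $\dot{f}(\theta) = 0$ is $\theta_0$. In the presence of
noise we have $\dot{\tilde{f}}(\theta)$ instead of $\dot{f}(\theta)$, so that

$$\dot{\tilde{f}}(\hat\theta) = \dot{f}(\hat\theta) + \epsilon = \ddot{f}(\theta_0)(\hat\theta - \theta_0) + \epsilon = 0 \tag{1.119}$$

where $\epsilon$ is the noise (error) term. The value of $\theta$ optimizing $\tilde{f}(\theta)$ is then

$$\hat\theta = \theta_0 - \frac{\epsilon}{\ddot{f}(\theta_0)} \tag{1.120}$$

The estimation error variance is therefore given by

$$E\{|\hat\theta - \theta_0|^2\} = \frac{1}{|\ddot{f}(\theta_0)|^2}\,E\{|\epsilon|^2\} \tag{1.121}$$

This analysis is valid only if the errors are sufficiently small, because the first-order
Taylor series expansion is a good approximation only for small deviations
from $\theta_0$.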

Next consider the case where $\theta$ is estimated by the value for which some
function $f(\theta) = 0$. Then

$$f(\theta) = f(\theta_0) + \dot{f}(\theta_0)(\theta - \theta_0) \tag{1.122}$$

Because $f(\theta_0) = 0$, we have

$$f(\theta) = \dot{f}(\theta_0)(\theta - \theta_0) \tag{1.123}$$

In the presence of noise we have $\tilde{f}(\theta)$ instead of $f(\theta)$, so that

$$\tilde{f}(\theta) = f(\theta) + \epsilon = \dot{f}(\theta_0)(\theta - \theta_0) + \epsilon \tag{1.124}$$

Solving for the value of $\theta$ that makes $\tilde{f}(\theta) = 0$, we get

$$\hat\theta - \theta_0 = -\frac{\epsilon}{\dot{f}(\theta_0)} \tag{1.125}$$

The estimation error variance is given by

$$E\{|\hat\theta - \theta_0|^2\} = \frac{1}{|\dot{f}(\theta_0)|^2}\,E\{|\epsilon|^2\} \tag{1.126}$$

As before, this analysis is only valid for small errors (i.e., high signal-to-noise
ratio).

Equations (1.121) and (1.126) show that the estimation error variance is a
simple function of the derivative of the cost function $f(\theta)$ and the variance of the
measurement error.
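
The first-order prediction of Equation (1.125) can be illustrated with a toy zero-crossing problem. In the sketch below the function f, the true value, the size of the perturbation, and the bisection helper are all hypothetical choices made only for illustration: the root of the perturbed function is found numerically and compared with the linearized error.

```python
import math


def bisect_root(func, lo, hi, iterations=200):
    """Find a sign change of func in [lo, hi] by plain bisection."""
    f_lo = func(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid  # root lies in the upper half
        else:
            hi = mid               # root lies in the lower half
    return 0.5 * (lo + hi)


theta0 = 0.3    # assumed true parameter value (rad)
epsilon = 1e-3  # assumed small additive error


def f(theta):
    """Noise-free function with f(theta0) = 0 and derivative 2 at theta0."""
    return math.sin(2.0 * (theta - theta0))


def f_noisy(theta):
    """Perturbed function, as in Equation (1.124)."""
    return f(theta) + epsilon


f_dot_at_theta0 = 2.0
theta_hat = bisect_root(f_noisy, theta0 - 0.2, theta0 + 0.2)

actual_error = theta_hat - theta0
predicted_error = -epsilon / f_dot_at_theta0  # Equation (1.125)
print(actual_error, predicted_error)
```

Increasing `epsilon` makes the two values drift apart, which is the small-error caveat stated above.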


A.1.5 Calculation of $E\{|\epsilon|^2\}$ for a Linear Combiner

Direction-finding systems such as the beamformer use a linear combination of
the array output to obtain information about emitter direction. In this section
we analyze the variance of the measurement error involved in the direction
estimation for a general linear combiner characterized by a weight vector $\mathbf{W}(\theta)$.
For convenience we assume that the weight vector is normalized to have unit
norm $|\mathbf{W}(\theta)|^2 = 1$.

These weights are applied to the measurements $\mathbf{y}$ at the array output
(Equation (1.3)) to yield

$$\underbrace{\mathbf{W}^H(\theta)\,\mathbf{y}}_{\tilde{g}(\theta)} = \underbrace{\sqrt{\mathrm{SNR}}\,\mathbf{W}^H(\theta)\,\mathbf{a}(\theta_0)}_{g(\theta)} + \underbrace{\mathbf{W}^H(\theta)\,\mathbf{v}}_{u(\theta)} \tag{1.127}$$

Because the weight vector is normalized, we have

$$E\{|u(\theta)|^2\} = \mathbf{W}^H(\theta)\,\underbrace{E\{\mathbf{v}\mathbf{v}^H\}}_{\mathbf{I}}\,\mathbf{W}(\theta) = 1 \tag{1.128}$$
I 

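Consider the case where $\theta$ is estimated by evaluating the magnitude of the linear
combiner output, which we denote $f(\theta)$ in the noise-free case and $\tilde{f}(\theta)$ in the
noisy case. Thus,

$$f(\theta) = |g(\theta)|^2 \tag{1.129}$$

and

$$\tilde{f}(\theta) = [g(\theta) + u(\theta)][g(\theta) + u(\theta)]^* = \underbrace{g(\theta)g^*(\theta)}_{f(\theta)} + g(\theta)u^*(\theta) + g^*(\theta)u(\theta) + u(\theta)u^*(\theta) \tag{1.130}$$

Differentiating with respect to $\theta$,

$$\dot{\tilde{f}}(\theta) = \dot{f}(\theta) + \underbrace{\dot{g}(\theta)u^*(\theta) + \dot{g}^*(\theta)u(\theta) + \dot{u}(\theta)u^*(\theta) + g(\theta)\dot{u}^*(\theta) + g^*(\theta)\dot{u}(\theta) + u(\theta)\dot{u}^*(\theta)}_{\epsilon} \tag{1.131}$$

Under the assumption that $u(\theta)$, $u^*(\theta)$, $\dot{u}(\theta)$, $\dot{u}^*(\theta)$ are uncorrelated random
variables, it is straightforward to check that the expected value of the
cross-product of any two of the six terms that $\epsilon$ comprises is zero. Therefore,

$$E\{|\epsilon|^2\} = 2|\dot{g}(\theta)|^2 E\{|u(\theta)|^2\} + 2|g(\theta)|^2 E\{|\dot{u}(\theta)|^2\} + 2E\{|u(\theta)|^2\}E\{|\dot{u}(\theta)|^2\} \tag{1.132}$$

or

$$E\{|\epsilon|^2\} = 2|\dot{g}(\theta)|^2 + 2|g(\theta)|^2 E\{|\dot{\mathbf{W}}(\theta)|^2\} + 2E\{|\dot{\mathbf{W}}(\theta)|^2\} \tag{1.133}$$

Inserting the expressions for $g(\theta)$ and $\dot{g}(\theta)$, we get

$$E\{|\epsilon|^2\} = 2\,\mathrm{SNR}\,|\dot{\mathbf{W}}^H(\theta)\mathbf{a}(\theta_0)|^2 + 2\left(\mathrm{SNR}\,|\mathbf{W}^H(\theta)\mathbf{a}(\theta_0)|^2 + 1\right)E\{|\dot{\mathbf{W}}(\theta)|^2\} \tag{1.134}$$

For a high SNR this equation can be simplified to

$$E\{|\epsilon|^2\} \approx 2\,\mathrm{SNR}\,|\dot{\mathbf{W}}^H(\theta)\mathbf{a}(\theta_0)|^2 + 2\,\mathrm{SNR}\,|\mathbf{W}^H(\theta)\mathbf{a}(\theta_0)|^2 E\{|\dot{\mathbf{W}}(\theta)|^2\} \tag{1.135}$$

Note that $u(\theta)$, $u^*(\theta)$ are uncorrelated because $\mathbf{v}$ and consequently $u(\theta)$ are
circularly symmetric complex Gaussian random variables. For the same reason
$\dot{u}(\theta)$, $\dot{u}^*(\theta)$ are uncorrelated, as are $u(\theta)$ and $\dot{u}^*(\theta)$. The correlation
of $u(\theta)$ and $\dot{u}(\theta)$ is given by

$$E\{\dot{u}(\theta)u^*(\theta)\} = \dot{\mathbf{W}}^H(\theta)\mathbf{W}(\theta) \tag{1.136}$$

These will be uncorrelated if $\dot{\mathbf{W}}^H(\theta)\mathbf{W}(\theta) = 0$. The cross-term involving
$\dot{u}(\theta)u^*(\theta)$ appears in conjunction with a term involving $\dot{u}^*(\theta)u(\theta)$. Therefore, the
condition required for $u(\theta)$ and $\dot{u}(\theta)$ is that $\mathrm{real}\{\dot{\mathbf{W}}^H(\theta)\mathbf{W}(\theta)\} = 0$. This condition
will always be satisfied in the cases discussed here.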

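Next consider the case where $\theta$ is estimated by evaluating the real part
of the linear combiner output rather than its magnitude. In this case $\tilde{f}(\theta) =
\mathrm{real}\{\mathbf{W}^H(\theta)\mathbf{y}\}$ and

$$\tilde{f}(\theta) = \mathrm{real}\{\mathbf{W}^H(\theta)\mathbf{y}\} = \underbrace{\sqrt{\mathrm{SNR}}\;\mathrm{real}\{\mathbf{W}^H(\theta)\mathbf{a}(\theta_0)\}}_{f(\theta)} + \underbrace{\mathrm{real}\{\mathbf{W}^H(\theta)\mathbf{v}\}}_{\epsilon} \tag{1.137}$$

The variance of the error is one-half of the variance of the complex random
variable $\mathbf{W}^H(\theta)\mathbf{v}$, which has unit variance. Therefore,

$$E\{|\epsilon|^2\} = \frac{1}{2} \tag{1.138}$$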


A.1.6 Estimation Error Variance for Beamforming 


In this section we derive the estimation error variance for a direction-finding 
system that uses a beamformer and estimates direction by the angle at which 
maximum power is observed. We consider the case where the beamformer weight 
vector is the nonwindowed normalized array manifold: 


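$$\mathbf{W}(\theta) = \frac{\mathbf{a}(\theta)}{|\mathbf{a}(\theta)|} \tag{1.139}$$

To evaluate the estimation error variance we use Equation (1.121), which requires
calculation of $\ddot{f}(\theta)$ and $E\{|\epsilon|^2\}$. First we use Equation (1.135) to evaluate $E\{|\epsilon|^2\}$.
Note that

$$\dot{\mathbf{W}}^H(\theta)\mathbf{a}(\theta_0) = \frac{\dot{\mathbf{a}}^H(\theta)\mathbf{a}(\theta_0)}{|\mathbf{a}(\theta)|} = 0 \tag{1.140}$$

Therefore,

$$E\{|\epsilon|^2\} \approx 2\,\mathrm{SNR}\left|\frac{\mathbf{a}^H(\theta)\mathbf{a}(\theta_0)}{|\mathbf{a}(\theta)|}\right|^2\left|\frac{\dot{\mathbf{a}}(\theta)}{|\mathbf{a}(\theta)|}\right|^2 = 2\,\mathrm{SNR}\,|\dot{\mathbf{a}}(\theta)|^2 \tag{1.141}$$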

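In this case,

$$f(\theta) = |g(\theta)|^2 = \mathrm{SNR}\left(\mathbf{W}^H(\theta)\mathbf{a}(\theta_0)\right)\left(\mathbf{W}^H(\theta)\mathbf{a}(\theta_0)\right)^* \tag{1.142}$$

The first derivative of $f(\theta)$ is given by

$$\dot{f}(\theta) = \mathrm{SNR}\left(\dot{\mathbf{W}}^H(\theta)\mathbf{a}(\theta_0)\right)\left(\mathbf{W}^H(\theta)\mathbf{a}(\theta_0)\right)^* + \mathrm{SNR}\left(\mathbf{W}^H(\theta)\mathbf{a}(\theta_0)\right)\left(\dot{\mathbf{W}}^H(\theta)\mathbf{a}(\theta_0)\right)^* \tag{1.143}$$

The second derivative of $f(\theta)$ is then given by

$$\ddot{f}(\theta) = 2\,\mathrm{SNR}\;\mathrm{real}\left\{\left(\ddot{\mathbf{W}}^H(\theta)\mathbf{a}(\theta_0)\right)\left(\mathbf{W}^H(\theta)\mathbf{a}(\theta_0)\right)^*\right\} + 2\,\mathrm{SNR}\left(\dot{\mathbf{W}}^H(\theta)\mathbf{a}(\theta_0)\right)\left(\dot{\mathbf{W}}^H(\theta)\mathbf{a}(\theta_0)\right)^* \tag{1.144}$$

Since $\dot{\mathbf{W}}^H(\theta)\mathbf{a}(\theta_0) = 0$, the second term is zero and we are left with

$$\ddot{f}(\theta) = 2\,\mathrm{SNR}\;\mathrm{real}\{\ddot{\mathbf{a}}^H(\theta)\mathbf{a}(\theta_0)\} \tag{1.145}$$

The direction estimation error variance is then given by

$$E\{|\hat\theta - \theta_0|^2\} = \frac{1}{|\ddot{f}(\theta_0)|^2}\,E\{|\epsilon|^2\} = \frac{|\dot{\mathbf{a}}(\theta_0)|^2}{2\,\mathrm{SNR}\left(\mathrm{real}\{\ddot{\mathbf{a}}^H(\theta_0)\mathbf{a}(\theta_0)\}\right)^2} \tag{1.146}$$

This result is for an estimate based on a single snapshot. If K snapshots are used,
the estimated directions can be averaged, reducing the variance by a factor of K.
Thus,

$$E\{|\hat\theta - \theta_0|^2\} = \frac{|\dot{\mathbf{a}}(\theta_0)|^2}{2K\,\mathrm{SNR}\left(\mathrm{real}\{\ddot{\mathbf{a}}^H(\theta_0)\mathbf{a}(\theta_0)\}\right)^2} \tag{1.147}$$

Using Equations (1.68) and (1.73) we conclude that for a linear array

$$E\{|\hat\theta - \theta_0|^2\} = \frac{1}{2K\,\mathrm{SNR}\,(2\pi/\lambda)^2\cos^2\theta\;\overline{d^2}} \tag{1.148}$$

Using Equations (1.80) and (1.85) we conclude that for a circular array

$$E\{|\hat\theta - \theta_0|^2\} = \frac{1}{2K\,\mathrm{SNR}\,(2\pi R/\lambda)^2\,\overline{c^2}(\theta)} \tag{1.149}$$

These results are identical to the corresponding CRB Equations (1.105) and
(1.106).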


A.1.7 Estimation Error Variance for Phase Matching 

A common direction-finding method involves measuring the phases of the signals
arriving at the array and matching them to the “phase manifold,” that is, the
phases of the array manifold $\mathbf{a}(\theta)$. In this section we evaluate the estimation error
of this method for a linear and circular array.

Linear Array

The model of the measured phases $\phi_m$ is given by

$$\begin{bmatrix} \phi_1 \\ \phi_2 \\ \vdots \\ \phi_M \end{bmatrix} = (2\pi/\lambda)\begin{bmatrix} d_1 \\ d_2 \\ \vdots \\ d_M \end{bmatrix}\sin\theta + \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_M \end{bmatrix} \tag{1.150}$$

where $(2\pi/\lambda)d_m\sin\theta$ are the expected phases for an emitter at direction $\theta$, and
$u_m$ are the phase measurement errors, assumed to be real Gaussian with zero
mean and variance $E\{u_m^2\} = \sigma^2$.

The estimated direction is obtained by minimizing the distance between the
measured phase vector and the expected phase vector. This is accomplished by
the following least-squares solution of Equation (1.150):

$$\sum_{m=1}^{M}\phi_m d_m = (2\pi/\lambda)\sin\theta\;\overline{d^2} + u \tag{1.151}$$

where $\overline{d^2} = \sum_{m=1}^{M} d_m^2$ as defined in Equation (1.69), and the error $u$,

$$u = \sum_{m=1}^{M} d_m u_m \tag{1.152}$$

is a zero-mean Gaussian random variable with variance

$$E\{u^2\} = \sum_{m=1}^{M} d_m^2\sigma^2 = \overline{d^2}\,\sigma^2 \tag{1.153}$$

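Solving for the sine of the direction we have

$$\sin\hat\theta = \frac{1}{(2\pi/\lambda)\,\overline{d^2}}\sum_{m=1}^{M}\phi_m d_m = \sin\theta_0 + \epsilon \tag{1.154}$$

where the error term is

$$\epsilon = \frac{u}{(2\pi/\lambda)\,\overline{d^2}} \tag{1.155}$$

The variance of this error is given by

$$E\{\epsilon^2\} = \frac{E\{u^2\}}{(2\pi/\lambda)^2\,(\overline{d^2})^2} = \frac{\sigma^2}{(2\pi/\lambda)^2\,\overline{d^2}} \tag{1.156}$$

Note that this error induces a direction error that can be identified from the Taylor
expansion

$$\sin\hat\theta = \sin\theta_0 + \underbrace{\cos\theta_0\,(\hat\theta - \theta_0)}_{\epsilon} \tag{1.157}$$

Therefore,

$$E\{(\hat\theta - \theta_0)^2\} = \frac{E\{\epsilon^2\}}{\cos^2\theta_0} \tag{1.158}$$

and

$$E\{(\hat\theta - \theta_0)^2\} = \frac{\sigma^2}{(2\pi/\lambda)^2\cos^2\theta_0\;\overline{d^2}} \tag{1.159}$$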

It remains to relate the phase measurement errors $u_m$ to the signal measurement
errors $v_m$ (the elements of $\mathbf{v}$ in Equation (1.7)). The two errors are related by

$$\sqrt{\mathrm{SNR}}\,e^{j(\phi_m + u_m)} = \sqrt{\mathrm{SNR}}\,e^{j\phi_m} + v_m \tag{1.160}$$

where here $\phi_m = (2\pi/\lambda)d_m\sin\theta$ denotes the noise-free phase. This nonlinear
relationship can be approximated at a high signal-to-noise ratio by the following
linear relationship:

$$\sqrt{\mathrm{SNR}}\,e^{j(\phi_m + u_m)} \approx \sqrt{\mathrm{SNR}}\,e^{j\phi_m}(1 + ju_m) = \sqrt{\mathrm{SNR}}\left(e^{j\phi_m} + e^{j(\phi_m + \pi/2)}u_m\right) \tag{1.161}$$

Comparing this to the signal model, we see that

$$v_m = \sqrt{\mathrm{SNR}}\,e^{j(\phi_m + \pi/2)}u_m \tag{1.162}$$

and therefore

$$\sigma^2 = E\{u_m^2\} = \frac{1}{2\,\mathrm{SNR}} \tag{1.163}$$

where the factor of 2 is due to the fact that $u_m$ is real while $v_m$ is complex. We
conclude that

$$E\{(\hat\theta - \theta_0)^2\} = \frac{1}{2\,\mathrm{SNR}\,(2\pi/\lambda)^2\cos^2\theta\;\overline{d^2}} \tag{1.164}$$

This result is for an estimate based on a single snapshot. If K snapshots are
available, the estimates will be averaged, yielding a K-fold variance reduction;
the result is then identical to the CRB for the linear array.
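
A small Monte Carlo experiment can be used to compare the least-squares estimator of Equation (1.154) with the single-snapshot variance of Equation (1.164). The sketch below is illustrative only: the array geometry, the 20 dB SNR, the number of trials, and the random seed are assumptions, and the phases are generated directly from the model of Equation (1.150), that is, already unwrapped with no 2π ambiguity.

```python
import math
import random


def phase_matching_estimate(phases, d, lam):
    """Least-squares direction estimate from measured phases, Equation (1.154)."""
    k = 2 * math.pi / lam
    d2_bar = sum(dm * dm for dm in d)
    sin_theta = sum(p * dm for p, dm in zip(phases, d)) / (k * d2_bar)
    sin_theta = max(-1.0, min(1.0, sin_theta))  # guard against |sin| > 1
    return math.asin(sin_theta)


def simulate_mse(theta0, d, lam, snr, trials, rng):
    """Empirical mean squared direction error over repeated single snapshots."""
    k = 2 * math.pi / lam
    sigma = math.sqrt(1.0 / (2.0 * snr))  # phase error std, Equation (1.163)
    total = 0.0
    for _ in range(trials):
        phases = [k * dm * math.sin(theta0) + rng.gauss(0.0, sigma) for dm in d]
        err = phase_matching_estimate(phases, d, lam) - theta0
        total += err * err
    return total / trials


# Hypothetical scenario.
rng = random.Random(0)
lam = 1.0
d = [(m - 3.5) * 0.5 * lam for m in range(8)]
theta0 = math.radians(20.0)
snr = 100.0  # 20 dB as a linear ratio

empirical = simulate_mse(theta0, d, lam, snr, trials=20000, rng=rng)
d2_bar = sum(dm * dm for dm in d)
predicted = 1.0 / (2.0 * snr * (2 * math.pi / lam) ** 2
                   * math.cos(theta0) ** 2 * d2_bar)  # Equation (1.164)
print(empirical, predicted, empirical / predicted)
```

At lower SNR, or with phases measured modulo 2π, the agreement is expected to degrade, since the derivation relies on the high-SNR linearization.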


Circular Array 

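The model of the measured phases $\phi_m$ is given by

$$\begin{bmatrix} \phi_1 \\ \phi_2 \\ \vdots \\ \phi_M \end{bmatrix} = (2\pi R/\lambda)\begin{bmatrix} \sin(\theta - \theta_1) \\ \sin(\theta - \theta_2) \\ \vdots \\ \sin(\theta - \theta_M) \end{bmatrix} + \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_M \end{bmatrix} \tag{1.165}$$

where $(2\pi R/\lambda)\sin(\theta - \theta_m)$ are the expected phases for an emitter at direction $\theta$,
and $u_m$ are the phase measurement errors, assumed to be real Gaussian with
zero mean and variance $E\{u_m^2\} = \sigma^2$.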

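The estimated direction is obtained by minimizing the distance between the
measured phase vector and the expected phase vector. This involves solving
a nonlinear least-squares problem. To analyze the error variance we linearize
these equations by expanding the phase in a Taylor series around the emitter
direction $\theta_0$:

$$\begin{bmatrix} \phi_1 \\ \phi_2 \\ \vdots \\ \phi_M \end{bmatrix} = (2\pi R/\lambda)\begin{bmatrix} \sin(\theta_0 - \theta_1) \\ \sin(\theta_0 - \theta_2) \\ \vdots \\ \sin(\theta_0 - \theta_M) \end{bmatrix} + (2\pi R/\lambda)\begin{bmatrix} \cos(\theta_0 - \theta_1) \\ \cos(\theta_0 - \theta_2) \\ \vdots \\ \cos(\theta_0 - \theta_M) \end{bmatrix}(\theta - \theta_0) \tag{1.166}$$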


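Comparing Equations (1.165) and (1.166) we see that

$$\begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_M \end{bmatrix} = (2\pi R/\lambda)\begin{bmatrix} \cos(\theta_0 - \theta_1) \\ \cos(\theta_0 - \theta_2) \\ \vdots \\ \cos(\theta_0 - \theta_M) \end{bmatrix}(\theta - \theta_0) \tag{1.167}$$

We can now solve a linear least-squares problem for the estimated direction $\hat\theta$ to
obtain

$$u = \sum_{m=1}^{M}\cos(\theta_0 - \theta_m)\,u_m = (\hat\theta - \theta_0)(2\pi R/\lambda)\sum_{m=1}^{M}\cos^2(\theta_0 - \theta_m) \tag{1.168}$$

or

$$(\hat\theta - \theta_0) = \frac{u}{(2\pi R/\lambda)\,\overline{c^2}(\theta_0)} \tag{1.169}$$

Note that $u$ has zero mean and variance

$$E|u|^2 = \sum_{m=1}^{M}\cos^2(\theta_0 - \theta_m)\,\sigma^2 = \overline{c^2}(\theta_0)\,\sigma^2 \tag{1.170}$$

Therefore,

$$E\{(\hat\theta - \theta_0)^2\} = \frac{E|u|^2}{(2\pi R/\lambda)^2\left(\overline{c^2}(\theta_0)\right)^2} = \frac{\sigma^2}{(2\pi R/\lambda)^2\,\overline{c^2}(\theta_0)} \tag{1.171}$$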


To relate the phase measurement errors to the signal measurement errors we
proceed exactly as for the case of the linear array to get

$$\sigma^2 = \frac{1}{2\,\mathrm{SNR}} \tag{1.172}$$

and therefore

$$E\{(\hat\theta - \theta_0)^2\} = \frac{1}{2\,\mathrm{SNR}\,(2\pi R/\lambda)^2\,\overline{c^2}(\theta_0)} \tag{1.173}$$

As before, this result is for an estimate based on a single snapshot. If K snapshots
are available, the estimates will be averaged, yielding a K-fold variance
reduction; the result is then identical to the CRB for the circular array.


A.1.8 Estimation Error Variance for Monopulse 

It was shown in Equations (1.12), (1.13), and (1.14) that the response $b(\theta)$
of the monopulse system can be approximated by the first derivative $\dot{p}(\theta)$ of
the beamformer response $p(\theta)$. This approximation holds for a range of angles
around the point where $\dot{p}(\theta) = 0$; it can therefore be used for computing the error
variance provided that the errors are sufficiently small (i.e., in the high SNR
region).

The direction estimate of the monopulse system is the direction $\theta$ for which
$b(\theta) = \dot{p}(\theta) = 0$. This is mathematically identical to the value of $\theta$ for which $p(\theta)$
is maximized. In other words, it is identical to the direction estimate generated
by the beamformer.

To be more specific, following Equation (1.126), the direction error variance
is given by

$$E\{|\hat\theta - \theta_0|^2\} = \frac{1}{|\ddot{p}(\theta_0)|^2}\,E\{|\epsilon|^2\} \tag{1.174}$$

where we substituted $\dot{f}(\theta) = \ddot{p}(\theta)$. This equation is identical to Equation (1.121).

The calculation of the error variance follows from Equations (1.130) and
(1.131), where in this case we substitute $\dot{f}(\theta) = b(\theta)$, $f(\theta) = p(\theta)$, and their noisy
versions $\dot{\tilde{f}}(\theta) = \tilde{b}(\theta)$, $\tilde{f}(\theta) = \tilde{p}(\theta)$. The rest of the calculations are identical to
those in Section A.1.6 to finally yield

$$E\{|\hat\theta - \theta_0|^2\} = \frac{|\dot{\mathbf{a}}(\theta_0)|^2}{2\,\mathrm{SNR}\left(\mathrm{real}\{\ddot{\mathbf{a}}^H(\theta_0)\mathbf{a}(\theta_0)\}\right)^2} \tag{1.175}$$

Inserting the appropriate expression for the derivatives of the array manifolds
of linear and circular arrays, we get the results presented in Equations (1.148)
and (1.149). These results are identical to the corresponding CRB expressions
(1.105) and (1.106).


A.1.9 Quadratic Interpolation for the MLE 

Assume that we have computed values of a two-dimensional function $d(\cdot, \cdot)$ in
the neighborhood of a given point $(\theta_0, \theta_1)$. Specifically, we have five values of
the function: $d(\theta_0, \theta_1)$, $d(\theta_0 \pm \Delta, \theta_1)$, and $d(\theta_0, \theta_1 \pm \Delta)$ for some step size $\Delta$.
We further assume that the peak of this function occurs in this neighborhood and
we want to calculate the exact peak location.

Let

$$\hat\theta_0 = \theta_0 + \tilde\theta_0, \qquad \hat\theta_1 = \theta_1 + \tilde\theta_1 \tag{1.176}$$

be a point in this neighborhood. Expanding the function in a second-order Taylor
series around $(\theta_0, \theta_1)$, we get

$$d(\hat\theta_0, \hat\theta_1) = d(\theta_0, \theta_1) + b_1\tilde\theta_0 + b_2\tilde\theta_1 + b_3\tilde\theta_0^2 + b_4\tilde\theta_1^2 + b_5\tilde\theta_0\tilde\theta_1 \tag{1.177}$$

Inserting the five values of the function and the corresponding values of $(\tilde\theta_0, \tilde\theta_1)$,
we obtain a set of linear equations that can be solved for the Taylor coefficients
$b_1, \ldots, b_5$.

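The optimum point must satisfy

$$\frac{\partial d(\hat\theta_0, \hat\theta_1)}{\partial\tilde\theta_0} = b_1 + 2b_3\tilde\theta_0 + b_5\tilde\theta_1 = 0 \tag{1.178}$$

and

$$\frac{\partial d(\hat\theta_0, \hat\theta_1)}{\partial\tilde\theta_1} = b_2 + 2b_4\tilde\theta_1 + b_5\tilde\theta_0 = 0 \tag{1.179}$$

This can be written as

$$\begin{bmatrix} 2b_3 & b_5 \\ b_5 & 2b_4 \end{bmatrix}\begin{bmatrix} \tilde\theta_0 \\ \tilde\theta_1 \end{bmatrix} = -\begin{bmatrix} b_1 \\ b_2 \end{bmatrix} \tag{1.180}$$

which is solved to yield $(\tilde\theta_0, \tilde\theta_1)$ and hence the peak location.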
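
The interpolation steps can be sketched as follows. This is a minimal illustration, not the authors' implementation; the function name and the test surface are hypothetical. One point deserves care: with the five-point stencil described above, the product of the two offsets is zero at every sample, so the samples carry no information about the cross coefficient b5. The sketch therefore sets b5 = 0 as an explicit assumption and obtains b1 through b4 from central differences.

```python
def quadratic_peak(d_c, d_0m, d_0p, d_1m, d_1p, theta0, theta1, step):
    """Refine a peak location from five samples of d(., .).

    d_c  : d(theta0, theta1)
    d_0m : d(theta0 - step, theta1),  d_0p : d(theta0 + step, theta1)
    d_1m : d(theta0, theta1 - step),  d_1p : d(theta0, theta1 + step)
    """
    # Coefficients of Equation (1.177) from central differences.
    b1 = (d_0p - d_0m) / (2.0 * step)
    b2 = (d_1p - d_1m) / (2.0 * step)
    b3 = (d_0p - 2.0 * d_c + d_0m) / (2.0 * step ** 2)
    b4 = (d_1p - 2.0 * d_c + d_1m) / (2.0 * step ** 2)
    b5 = 0.0  # assumption: cross term is not observable from this stencil

    # Solve the 2x2 system of Equation (1.180) by Cramer's rule.
    det = 4.0 * b3 * b4 - b5 * b5
    if det == 0.0:
        raise ValueError("degenerate quadratic fit; no unique stationary point")
    t0 = (-2.0 * b4 * b1 + b5 * b2) / det
    t1 = (-2.0 * b3 * b2 + b5 * b1) / det
    return theta0 + t0, theta1 + t1


def surface(x, y):
    """Hypothetical test function with a peak at (0.3, -0.1)."""
    return -(x - 0.3) ** 2 - 2.0 * (y + 0.1) ** 2


x0, y0, delta = 0.25, -0.05, 0.1
peak = quadratic_peak(
    surface(x0, y0),
    surface(x0 - delta, y0), surface(x0 + delta, y0),
    surface(x0, y0 - delta), surface(x0, y0 + delta),
    x0, y0, delta,
)
print(peak)
```

If the cross term matters, an additional off-axis sample such as d(θ0 + Δ, θ1 + Δ) would be needed to estimate b5; that extension is not described in the text.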


A.1.10 Generalized Likelihood Ratio Detector 


Consider the binary hypothesis-testing problem defined in Equations (1.59) and
(1.60). Let $\mathbf{Y} = [\mathbf{y}[1], \ldots, \mathbf{y}[K]]$ be the $M \times K$ received data matrix collected
over K snapshots. The generalized likelihood ratio (GLR) detector for a binary
hypothesis-testing problem with unknown parameters involves the calculation
of the likelihood ratio

$$LR = \frac{\max_{\mathbf{p}_1} f(\mathbf{Y}, \mathbf{p}_1 | H_1)}{\max_{\mathbf{p}_0} f(\mathbf{Y}, \mathbf{p}_0 | H_0)} \tag{1.181}$$


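where $f(\mathbf{Y}, \mathbf{p}_0 | H_0)$ is the probability density function (p.d.f.) of the measurements
under hypothesis $H_0$, with $\mathbf{p}_0$ being the “nuisance parameters” affecting the p.d.f.
Similarly, $f(\mathbf{Y}, \mathbf{p}_1 | H_1)$ is the p.d.f. under $H_1$, with $\mathbf{p}_1$ being the corresponding
“nuisance parameters.”

In the case where $f(\cdot)$ is a multivariate Gaussian distribution function, the
log likelihood function has the form

$$L = \mathrm{trace}\left\{\mathbf{Y}^H\left(\mathbf{R}_0^{-1}(\hat{\mathbf{p}}_0) - \mathbf{R}_1^{-1}(\hat{\mathbf{p}}_1)\right)\mathbf{Y}\right\} + K\log\frac{|\mathbf{R}_0(\hat{\mathbf{p}}_0)|}{|\mathbf{R}_1(\hat{\mathbf{p}}_1)|} \tag{1.182}$$

where $\mathbf{R}_0$ and $\mathbf{R}_1$ are the covariance matrices of the data vector under $H_0$ and
$H_1$, respectively. The data is assumed to have zero mean under both hypotheses.
The variables $\hat{\mathbf{p}}_0, \hat{\mathbf{p}}_1$ are the maximum likelihood estimates of $\mathbf{p}_0$ and $\mathbf{p}_1$ under
$H_0$ and $H_1$, respectively.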

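In the resolution problem considered in this chapter, we have $\mathbf{p}_0 = [\alpha_0, \theta_0]$
and $\mathbf{p}_1 = [\alpha_1, \theta_1, \alpha_2, \theta_2]$. The covariance matrices are

$$\mathbf{R}_0 = \alpha_0\,\mathbf{a}(\theta_0)\mathbf{a}^H(\theta_0) + \sigma^2\mathbf{I} \tag{1.183}$$

and

$$\mathbf{R}_1 = \alpha_1\,\mathbf{a}(\theta_1)\mathbf{a}^H(\theta_1) + \alpha_2\,\mathbf{a}(\theta_2)\mathbf{a}^H(\theta_2) + \sigma^2\mathbf{I} \tag{1.184}$$

To calculate the log likelihood ratio, we need the maximum likelihood estimates
of $\alpha_0, \theta_0$ under $H_0$ and of $\theta_1, \theta_2, \alpha_1, \alpha_2$ under $H_1$. The maximum likelihood
estimate of $\theta_0$ can be obtained using the beamformer and searching for the direction
of the maximum output power. The estimate of $\alpha_0$ is then given by

$$\hat\alpha_0 = \sum_{k=1}^{K}\mathbf{a}^H(\hat\theta_0)\,\mathbf{y}[k]\,/\,|\mathbf{a}^H(\hat\theta_0)|^2 \tag{1.185}$$

The maximum likelihood estimate $\hat\theta_1, \hat\theta_2$ is given by numerical minimization of
the cost function

$$d(\theta_1, \theta_2) = \sum_{k=1}^{K}\mathbf{y}^H[k]\,\mathbf{P}^{\perp}_{\mathbf{S}(\theta_1, \theta_2)}\,\mathbf{y}[k] \tag{1.186}$$

where

$$\mathbf{S}(\theta_1, \theta_2) = [\mathbf{a}(\theta_1), \mathbf{a}(\theta_2)] \tag{1.187}$$

After evaluating $\hat\theta_1, \hat\theta_2$, the estimates of $\alpha_1, \alpha_2$ are given by

$$\begin{bmatrix} \hat\alpha_1 \\ \hat\alpha_2 \end{bmatrix} = \left(\mathbf{S}^H(\hat\theta_1, \hat\theta_2)\,\mathbf{S}(\hat\theta_1, \hat\theta_2)\right)^{-1}\mathbf{S}^H(\hat\theta_1, \hat\theta_2)\,\frac{1}{K}\sum_{k=1}^{K}\mathbf{y}[k] \tag{1.188}$$

Inserting the estimated parameters into Equations (1.183) and (1.184), we can
now calculate $\mathbf{R}_0$ and $\mathbf{R}_1$ and the log likelihood ratio $L$.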

The GLR detector operates by comparing the log likelihood ratio to a threshold
value $Th$. If $L > Th$, we choose $H_1$ (i.e., we decide that two signals are present).
If $L < Th$, we choose $H_0$ (i.e., we decide that only one signal is present).

The detector can make two types of error: deciding $H_1$ when $H_0$ is true, or
deciding $H_0$ when $H_1$ is true. The probabilities of these events are $\Pr\{L > Th \,|\, H_0\}$
and $\Pr\{L < Th \,|\, H_1\}$, respectively. The overall error probability $P_e$ is then given
by

$$P_e = \Pr\{L > Th \,|\, H_0\}\Pr\{H_0\} + \Pr\{L < Th \,|\, H_1\}\Pr\{H_1\} \tag{1.189}$$

where $\Pr\{H_0\}$ and $\Pr\{H_1\}$ are the a priori probabilities of having one or two
signals.

Assuming equal a priori probabilities, we have

$$P_e = 0.5\left(\Pr\{L > Th \,|\, H_0\} + \Pr\{L < Th \,|\, H_1\}\right) \tag{1.190}$$

Note that this probability is a function of the selected threshold. It is always
possible to select this threshold so as to minimize error probability. We will refer
to this value of $P_e$ as the minimum probability of error (MPE).
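
When the distribution of L is only available through simulation, the MPE can be estimated empirically. The sketch below assumes that two lists of log likelihood ratio samples are available, one generated under each hypothesis; the sample values shown are made up for illustration, and the function names are hypothetical. It evaluates Equation (1.190) at each candidate threshold and keeps the smallest value.

```python
def error_probability(threshold, l_under_h0, l_under_h1):
    """Empirical P_e of Equation (1.190) for equal a priori probabilities."""
    # Deciding H1 although H0 is true: L exceeds the threshold under H0.
    p_h1_given_h0 = sum(1 for v in l_under_h0 if v > threshold) / len(l_under_h0)
    # Deciding H0 although H1 is true: L falls below the threshold under H1.
    p_h0_given_h1 = sum(1 for v in l_under_h1 if v < threshold) / len(l_under_h1)
    return 0.5 * (p_h1_given_h0 + p_h0_given_h1)


def minimum_probability_of_error(l_under_h0, l_under_h1):
    """Search the observed values of L for the threshold minimizing P_e."""
    candidates = sorted(set(l_under_h0) | set(l_under_h1))
    best = min(candidates,
               key=lambda t: error_probability(t, l_under_h0, l_under_h1))
    return best, error_probability(best, l_under_h0, l_under_h1)


# Made-up samples of L, for illustration only.
l_h0 = [-1.0, 0.2, 0.5, 1.1]
l_h1 = [0.8, 1.5, 2.0, 3.0]

threshold, mpe = minimum_probability_of_error(l_h0, l_h1)
print(threshold, mpe)
```

Restricting the search to observed values of L is sufficient here because the empirical error probability only changes when the threshold crosses a sample.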

For a given signal-to-noise ratio the resolution limit can be defined as the
separation between two signals for which the MPE equals a desired value, say
MPE = 0.1. Evaluating the resolution limit for different SNRs shows that in general
the resolution limit normalized by the beamwidth is inversely proportional
to the fourth root of the SNR (Friedlander, 2009).

Practical Aspects of Design
and Application of
Direction-Finding Systems


Franz Demmel 


2.1 INTRODUCTION 

The rapid development of radio communications has resulted in the increasing
importance of direction finding, at the same time significantly boosting the associated
requirements and complexity of direction-finding (DF) systems. Grasping
this level of complexity requires a sufficiently broad theoretical foundation and
(more than ever) the ability to apply it to practical situations. Ultimately, direction
finding is an interdisciplinary field, and its superstructure in signal theory
falls in the field of array signal processing. However, antennas and wave propagation,
radio frequency (RF) circuit technology, data communications, software
engineering, and the like, are also essential in this context. Practical implementation
of ideas and concepts always involves consideration of applicable economic
constraints as well as management of issues such as size, weight, and operation.
The following sections will discuss how these different aspects interact as part
of any practical implementation of direction-finding systems.

2.2 APPLICATION OF DIRECTION-FINDING 
SYSTEMS 

Direction finding plays a key role in many areas of radio communications. The 
great majority of these involve estimating the direction of an emitter relative to 
a specified reference direction (true north, magnetic north, heading of a vehicle, 
airplane, or ship). 


In direction finding, the direction of the emitter is specified as in navigation
as follows:

• Azimuth angle $\alpha$ computed in the clockwise direction from the reference
direction

• Elevation angle $\varepsilon$ from the horizontal plane in the direction of the zenith

For characterizing antenna properties and for theoretical considerations,
angular notation is used for polar coordinates:

• Azimuth angle $\varphi$ computed in the counterclockwise direction from the
x-axis

• Zenith or polar angle $\theta$ computed from the positive z-axis

We thus have

$$\alpha = \pi/2 - \varphi$$
$$\varepsilon = \pi/2 - \theta$$
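
The two conventions can be converted into one another with a few lines of code. The sketch below is only an illustration: the function names are hypothetical, and wrapping the navigation azimuth into the range from 0 to 2π is an added convenience that the text does not prescribe.

```python
import math


def polar_to_navigation(phi, theta):
    """Convert polar angles (phi, theta) to navigation angles (alpha, epsilon).

    phi   : azimuth, counterclockwise from the x-axis (rad)
    theta : zenith angle from the positive z-axis (rad)
    """
    alpha = (math.pi / 2 - phi) % (2 * math.pi)  # wrapped to [0, 2*pi): assumption
    epsilon = math.pi / 2 - theta
    return alpha, epsilon


def navigation_to_polar(alpha, epsilon):
    """Inverse conversion; phi is wrapped to [0, 2*pi) as well."""
    phi = (math.pi / 2 - alpha) % (2 * math.pi)
    theta = math.pi / 2 - epsilon
    return phi, theta


# A wave arriving along the x-axis in the horizontal plane.
alpha_x, eps_x = polar_to_navigation(0.0, math.pi / 2)
# A wave arriving from phi = 180 degrees, 30 degrees above the horizon.
alpha_w, eps_w = polar_to_navigation(math.pi, math.radians(60.0))
print(math.degrees(alpha_x), math.degrees(eps_x))
print(math.degrees(alpha_w), math.degrees(eps_w))
```

With the reference direction along the y-axis, the first example corresponds to a bearing of 90 degrees and the second to a bearing of 270 degrees.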

In any case, the wave incidence angles represent merely an intermediate result
for the user because the estimated directions of the incoming waves are in general
different from the direction of the emitter of interest. Any practical implementation
of a direction-finding system must take this fact into account, primarily by
minimizing the following influences:

• Multipath propagation 

• Co-channel interfering signals 

• Interference generated internally in the system 

One of the most important uses of direction finders involves fixing the position
of emitters. This can include rescuing shipwrecked persons and locating
unknown, illegitimate signals such as undesired emissions from an industrial
facility or unlicensed base stations, or signals to be used to remotely detonate
improvised explosive devices (IEDs). In the area of military radio reconnaissance,
radio position fixing plays a key role in gaining tactical structure pictures
and thus in assessing the potential threat of an enemy.

A popular method for obtaining exact position results involves the use of 
multiple direction finders. The bearings they provide are used in a center station 
to determine location through triangulation. 
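
As a rough illustration of triangulation with two direction finders, the sketch below intersects two lines of bearing in a flat, local coordinate system. The coordinate convention (x pointing east, y pointing north, azimuth in degrees clockwise from north) and the function name are assumptions made for this example; a real system would also have to handle bearing errors, more than two stations, and Earth curvature.

```python
import math


def triangulate(p1, az1_deg, p2, az2_deg):
    """Intersect two lines of bearing; return (x, y) or None if there is no fix.

    p1, p2:   (x, y) positions of the direction finders (x east, y north)
    az*_deg:  measured azimuths, clockwise from north, in degrees
    """
    a1, a2 = math.radians(az1_deg), math.radians(az2_deg)
    d1 = (math.sin(a1), math.cos(a1))  # unit vector along bearing 1
    d2 = (math.sin(a2), math.cos(a2))  # unit vector along bearing 2
    det = d2[0] * d1[1] - d1[0] * d2[1]  # equals sin(a2 - a1)
    if abs(det) < 1e-9:
        return None  # bearings are (nearly) parallel
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    t1 = (d2[0] * dy - d2[1] * dx) / det  # distance along bearing 1
    t2 = (d1[0] * dy - d1[1] * dx) / det  # distance along bearing 2
    if t1 < 0 or t2 < 0:
        return None  # lines cross behind a station, so the bearings diverge
    return (p1[0] + t1 * d1[0], p1[1] + t1 * d1[1])
```

For stations at (0, 0) and (10, 0) reporting 45° and 315°, the fix is at (5, 5). The same routine applies to a running fix if the two positions are successive locations of one mobile direction finder.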

Single-station locators (SSLs) use only a single direction finder, although in 
actual practice two techniques are used. Position fixing with shortwave signals 
propagated via the ionosphere involves the use of direction finders that measure 
the elevation of the incident wave in addition to the azimuth. Based on an estimate 
of the virtual height H of the ionosphere’s reflecting layer, the distance to the 
emitter can be determined; the intersection with the line of bearing (LOB) then 
yields the emitter location. 
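
The geometry of single-station location by elevation can be sketched under strong simplifying assumptions that are not taken from the text: a flat Earth, a single hop, and a mirror-like reflection at virtual height H above the midpoint of the path. Under these assumptions the ground range is d = 2H / tan(ε). The function name and the coordinate convention (x east, y north, azimuth clockwise from north) are hypothetical.

```python
import math


def ssl_position(df_pos, azimuth_deg, elevation_deg, virtual_height_km):
    """Estimate emitter position from one bearing, one elevation, and a virtual height.

    Flat-earth, single-hop approximation: the path is an isosceles triangle
    whose apex lies at the virtual height above the midpoint of the ground range.
    Returns ((x, y), ground_range_km).
    """
    eps = math.radians(elevation_deg)
    if not 0.0 < eps < math.pi / 2:
        raise ValueError("elevation must lie strictly between 0 and 90 degrees")
    ground_range = 2.0 * virtual_height_km / math.tan(eps)
    alpha = math.radians(azimuth_deg)
    x = df_pos[0] + ground_range * math.sin(alpha)  # east component along the LOB
    y = df_pos[1] + ground_range * math.cos(alpha)  # north component along the LOB
    return (x, y), ground_range
```

With H = 300 km and an elevation of 45°, the estimated ground range is 600 km. For long paths the Earth's curvature matters and this approximation is no longer adequate.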

Single-station location is also possible with mobile direction finders (called
“running fix”). Here, too, position is determined through triangulation, with the
LOB being tracked to the intersection as determined from different positions
of the mobile direction finder. Figure 2.1 provides an overview of the different
techniques.

FIGURE 2.1 Locating emitters using direction finders: (a) triangulation; (b) single-station location
by running fix; (c) single-station location by estimation of elevation angle and virtual height of the
reflecting ionosphere layer in the high-frequency range.

Position fixing for emitters is often a multistep process. Direction finders
distributed countrywide make it possible to fix the position of an emitter through
triangulation with an accuracy of several kilometers (typically 1-3% of the distance
between direction finders). The next step in localizing the emitter is the
use of direction finders installed in vehicles. Finally, portable direction finders
are used to focus the search on the last 100 meters (e.g., in buildings).

However, direction finders have other important applications besides fixing 
the position of emitters. Information about the angle of incidence of radio signals 
can be exploited very well for the following purposes: 

• Spatially selective searches for signals 

• Separation of signals (e.g., for resolution of complex radar scenarios 
in EW) 

• Segmentation of spectra with the aid of wideband direction finders, allowing
improved estimation of the center frequency and the bandwidth of
unknown signals

• Control of directional antennas in communications systems (e.g., for 
spatial division multiple access (SDMA)) 

Many of the latest applications require broadband or multichannel direction finding.
One important example is searching for unknown signals and estimating
their parameters, since in most cases the spatial spectrum of an emitter exhibits
much higher contrast than does, say, the power spectrum. Figure 2.2 shows by
way of example the power and spatial spectra of a conventional narrowband
signal in comparison to the spectra of a signal with spectrum spreading.

FIGURE 2.2 Power spectrum and spatial spectrum of a conventional narrowband signal (a) and
a wideband signal (b). The contrast remains unchanged in the spatial spectrum.

Because spectral components with the same bearing have a high probability of
arriving from the same emitter, direction finders are very efficient for segmenting
signals. The example shown in Figure 2.3 illustrates how this works even with
very dense frequencies: While the power spectrum provides no indication of the
different emitters, the spatial spectrum allows them to be clearly distinguished.

FIGURE 2.3 Signal segmentation of dense scenarios through evaluation of the spatial spectrum.

2.3 TYPICAL SYSTEM DESIGN—OVERVIEW 

All of the latest direction-finding systems use the structure shown in Figure 2.4.
As shown, the N output signals from spatially distributed antenna elements
are mapped by an RF network onto H coherent measurement channels, then
digitized and fed to a signal processing unit where they are used to estimate
parameters of interest such as

• Number of incident waves (detection) 

• Azimuth and elevation of detected waves 

• Polarization of detected waves 

• Power levels, modulation signals, and the like 

In the simplest case, a measurement channel is assigned to each antenna element:
H = N. For economic reasons, it is often necessary to minimize the number
of measurement channels. In this case, the RF network can be implemented as
a multiplexer or beamformer, for example. If a multiplexer is used, the antenna
array is divided into subarrays, with H elements each, and connected sequentially
to the measurement channels. This increases the measurement time
by the number of sequential measurement steps, so the necessary minimum
signal duration also increases; a compromise between expense and
measurement time is therefore required.

FIGURE 2.4 Typical block diagram of a DF system.

Prior to digitization of the signals, filtering and, if the analog/digital
converters (ADCs) do not allow processing at the frequency of the antenna
signals, conversion to a suitable intermediate frequency (IF) are required (tuner).
The input signal is mixed in one or more stages with an unmodulated signal, and
the result is filtered at the sum or difference frequency (corresponding to the
intermediate frequency). For subsequent evaluation of the phase differences, coherent
behavior is required from the measurement channels. Accordingly, a common
local oscillator (synthesizer) is used in most cases for the frequency conversion
along with common clock sources for analog-to-digital (A/D) conversion.

To reduce slowly fluctuating amplitude and phase deviations (e.g., due to
temperature and aging), each measurement channel can be supplied with defined
calibration signals. With these, the transmission characteristics of each
measurement path can be assessed at regular intervals and stored for subsequent
compensation of the measured antenna values.

At the digital signal processing level, the signals are first down-converted to
the complex baseband. This is followed by the main filtering, which, depending
on the particular application, is handled either in the form of a filter bank or
for a selected central channel in the case of simple systems for direction finding
involving narrowband signals. Filter banks are used in wideband direction finders.
During the next processing step, a quantity of samples determined based on
the selected averaging time is collected and fed to the algorithms for estimation
of the bearings and any other parameters of interest.

This part of the processing typically involves the use of field programmable 
gate arrays (FPGAs) because of the necessary high processing speed and the 
tight coupling with hardware. Standard processors such as those found in today’s 
PCs are typically used for further processing of the estimated parameters. All 
of the computations not subject to any challenging real-time requirements are 
performed here, including 

• Computation of position and compass data 

• Computation of platform-dependent compensation data 

• Setting of hardware parameters 

• Management of system hardware 

• Execution of internal test routines in the system 

• Data output (typically via an Ethernet interface) 

Standard PCs are used to display results and to output and/or record the
baseband waveforms and the demodulated signals. Figure 2.5
shows the components in a practical direction-finding system for the frequency
range from 20 MHz to 3 GHz.

FIGURE 2.5 Practical implementation of the components in a direction-finding system for the
VHF/UHF range. (Photo courtesy of Rohde & Schwarz.)


2.4 PERFORMANCE PARAMETERS 

The main requirements for a direction-finding system are as follows: 

• High accuracy 

• High sensitivity 

• Sufficient large-signal immunity 

• Immunity to distortion of the received wavefield as a result of multipath 
propagation 

• Stable performance in case of co-channel interfering signals 

• Short minimum signal duration 

• High search speed and probability of intercept in scanning direction 
finders 


2.4.1 Accuracy 

The accuracy of a direction finder is characterized by the bearing errors that 
result from variations in emitter position and frequency. The signal-to-noise 
ratio (SNR) of the signal is selected to be high enough that the possibility of 
errors caused by noise can be neglected. Conventionally, the root mean-square 
(RMS) value is used to characterize the errors. In the frequency range from 1 
MHz to 3 GHz, errors of between 1° and 2° RMS can be expected in commercial 
direction-finding systems. 


2.4.2 Sensitivity 

The sensitivity of a direction finder means its ability to produce accurate
direction-finding results even when faced with low-amplitude signals. Conventionally,
the RMS value of the electrical field strength of a narrowband signal
is specified for which a certain bearing error is not exceeded (e.g., 3° RMS).
The best possible sensitivity that can be attained is dependent primarily on the
size of the antenna array, the gain and noise factor of the antenna elements,
the noise factor of the DF converter, and measurement time. The
Cramér-Rao lower bound (CRLB) is of great practical interest when estimating
this limit (Van Trees, 2002). If we consider a case important in practice, a
uniform circular array (UCA) with N omnidirectional elements and a diameter
D at wavelength $\lambda$, we obtain the following for the lower limit of the variance
of the azimuth bearing (Van Trees, 2002):

$$\sigma_\varphi^2 \ge \frac{\lambda^2}{\pi^2 D^2 N K}\left(\frac{1}{\mathrm{SNR}} + \frac{1}{N\,\mathrm{SNR}^2}\right) \qquad (2.1)$$

We assume a wave at an angle of elevation of 0. Equation (2.1) does not depend
on the azimuth angle $\varphi$ because of the balanced symmetric nature of UCAs
(Manikas, 2004). K is the number of snapshots used to compute the bearing, and
SNR is the signal-to-noise ratio at the output of one of the N receiving paths,
which are assumed to be identical:

$$\mathrm{SNR} = \frac{P}{P_n} = \frac{E^2}{E_n^2} \qquad (2.2)$$

E is the signal field strength, and $E_n$ is the equivalent noise field strength of the
instrument noise, that is, the field strength that produces the same output noise
power $P_n$ assuming a noise-free system.

For high SNR,

$$\frac{1}{\mathrm{SNR}} \gg \frac{1}{N\,\mathrm{SNR}^2}$$

we obtain

$$E \ge E_n \frac{\lambda}{\sigma_\varphi\, \pi D \sqrt{N K}} \qquad (2.3)$$

For example, a system with a 5-element antenna with $D/\lambda = 0.5$, a bearing
fluctuation of $\sigma_\varphi = 2° = 0.0349$ rad, and one snapshot would require a field strength of
$E > 4.08\,E_n$.


2.4.3 Large-Signal Immunity 

The large-signal immunity of a direction finder characterizes its behavior in the 
presence of high-amplitude signals outside the frequency channel containing the 
signal of interest. Because of the increase in transmitter density, this aspect plays 
an increasingly critical role in the implementation of practical direction-finding 
systems. 

Saturation effects, inter-modulation, and reciprocal mixing are the main factors
influencing large-signal immunity. Saturation effects are relevant primarily
with A/D conversion. Inter-modulation is an effect where two or more large signals
are mixed because of the nonlinear transfer characteristics of electronic
devices such as amplifiers and mixers (Bussgang et al., 1974). The mixing
products fall within the tuned frequency band (Figure 2.6).

FIGURE 2.6 Inter-modulation products of the second order (frequency $f_2 - f_1$) and third order
(frequency $2f_2 - f_1$) lying within the receiving band.

Inter-modulation is typically characterized using what is known as the “intercept
points” $\mathrm{IP}_k$ of order k. In real-world systems, inter-modulation products of
the second and third orders predominate.

Using the intercept point $\mathrm{IP}_2$ we obtain the following relationship between
the power of the second-order mixing products $P_{m2}$ at frequency $f_{m2}$ and the
interfering signals $P_{I1}$ and $P_{I2}$ at frequencies $f_1$ and $f_2$:

$$P_{m2} = \frac{P_{I1} P_{I2}}{\mathrm{IP}_2} \qquad (2.4)$$

assuming small-signal conditions and the frequency relationship

$$f_{m2} = |f_1 \pm f_2| \qquad (2.5)$$

For the power of the third-order mixing products, we obtain

$$P_{m3} = \frac{P_{I1}^{\,p}\, P_{I2}^{\,q}\, P_{I3}^{\,r}}{\mathrm{IP}_3^{\,2}}, \qquad p + q + r = 3 \qquad (2.6)$$

The frequency relation between interferers and mixing product satisfies

$$f_{m3} = |p f_1 \pm q f_2 \pm r f_3| \qquad (2.7)$$

Conventionally, the intercept point is specified as a power level referred to 1 mW
in dBm, $\mathrm{IP}_k(\mathrm{dBm}) = 10 \lg(\mathrm{IP}_k / 1\,\mathrm{mW})$:

$$\mathrm{IP}_2 = P_{I1} + P_{I2} - P_{m2} \quad (\mathrm{dBm})$$
$$\mathrm{IP}_3 = \frac{p P_{I1} + q P_{I2} + r P_{I3} - P_{m3}}{2} \quad (\mathrm{dBm}) \qquad (2.8)$$

Typical values at the antenna output/DF converter interface are around 50 dBm
for the second-order intercept point $\mathrm{IP}_2$ and 10 dBm for the third-order intercept
point $\mathrm{IP}_3$.
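
To get a feel for these numbers, the logarithmic form of the intercept-point relations can be solved for the level of the mixing products. The sketch below assumes small-signal conditions, all levels in dBm, and hypothetical function names; the interferer levels in the usage note are illustrative and not taken from the text.

```python
def im2_level_dbm(p1_dbm, p2_dbm, ip2_dbm):
    """Second-order product level: P_m2 = P_I1 + P_I2 - IP2 (all in dBm)."""
    return p1_dbm + p2_dbm - ip2_dbm


def im3_level_dbm(powers_dbm, orders, ip3_dbm):
    """Third-order product level: P_m3 = sum(order * P) - 2 * IP3 (all in dBm).

    powers_dbm: interferer levels, e.g. (P_I1, P_I2)
    orders:     matching non-negative integers (p, q, ...) that sum to 3
    """
    if len(powers_dbm) != len(orders) or sum(orders) != 3:
        raise ValueError("orders must match the interferers and sum to 3")
    weighted = sum(k * p for k, p in zip(orders, powers_dbm))
    return weighted - 2.0 * ip3_dbm
```

With two interferers at -20 dBm each and the typical intercept points quoted above (50 dBm and 10 dBm), the second-order product lies at -90 dBm and the third-order product for p = 2, q = 1 at -80 dBm. Every decibel of additional interferer level raises the third-order product by 3 dB, which is why third-order products often dominate at high signal levels.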

Reciprocal mixing arises in sampling systems such as ADCs and mixers. It
is due to phase fluctuations in the sample signal (e.g., clock generator or local
oscillator). These fluctuations cause a widening of the signal in the frequency
domain, which is known as sideband noise. If a strong interfering signal now
reaches the sampling device, the sideband noise will be mixed into its sidebands.
If the signal of interest is located in these sidebands, there will be a reduction in
the SNR (Figure 2.7).

The noise power $P_{np}$ caused by reciprocal mixing within a narrow receiving
band of width B is determined by

$$P_{np} = P_I\, B\, 10^{a_{pn}/10} \qquad (2.9)$$

where $a_{pn}$ is the spectral power density of the single-sideband phase noise of the
local oscillator or other sampling device related to the carrier (usually stated in
dBc/Hz). $P_I$ is the power of the interfering signal.

Consider the phase noise of a high-quality local oscillator specified as
-120 dBc/Hz at a frequency offset of 100 kHz. The receiving bandwidth is
10 kHz, and an interfering signal occurs with a power level of -30 dBm
at a distance of 100 kHz from the tuned frequency. We get a noise power
of $P_{np}(\mathrm{dBm}) = P_I(\mathrm{dBm}) + 10 \lg B + a_{pn} = -110$ dBm due to reciprocal mixing.
By comparison, the intrinsic noise of a device having a noise figure of 10 dB
amounts to just -124 dBm in the same bandwidth.

FIGURE 2.7 Reciprocal mixing.
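
The two noise levels in this example can be reproduced with a few lines. The sketch assumes the usual thermal noise density of about -174 dBm/Hz near room temperature, a value that is not stated in the text; the function names are hypothetical.

```python
import math


def reciprocal_mixing_noise_dbm(p_interferer_dbm, bandwidth_hz, phase_noise_dbc_hz):
    """Equation (2.9) in logarithmic form: P_np = P_I + 10*lg(B) + a_pn."""
    return p_interferer_dbm + 10.0 * math.log10(bandwidth_hz) + phase_noise_dbc_hz


def receiver_noise_dbm(bandwidth_hz, noise_figure_db, thermal_density_dbm_hz=-174.0):
    """Intrinsic noise power of a device with the given noise figure.

    The default thermal density is an assumption (kT at roughly 290 K).
    """
    return thermal_density_dbm_hz + 10.0 * math.log10(bandwidth_hz) + noise_figure_db


mixing = reciprocal_mixing_noise_dbm(-30.0, 10e3, -120.0)  # about -110 dBm
intrinsic = receiver_noise_dbm(10e3, 10.0)                 # about -124 dBm
degradation_db = mixing - intrinsic                        # about 14 dB
```

In this scenario the interferer raises the noise in the receiving channel by roughly 14 dB above the intrinsic noise, so reciprocal mixing rather than the noise figure limits the sensitivity.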

2.4.4 Immunity to Distortion of the Received Wavefield 
Due to Multipath Propagation 

Under normal circumstances, wave propagation between the emitter and the
direction finder is disrupted by buildings, mountains, hills, and the like. Even if
we have direct line of sight, secondary waves still arise, because of reflection,
refraction, and diffraction, which are then superimposed on the direct wave at
the receiving location as interfering fields. If the interfering wave component has
a lower power level than the desired wave component, the bearing error can be
minimized by choosing suitable dimensions in the direction finder. In the simplest
case involving single-wave algorithms, we just need to select a sufficiently
large antenna aperture. With high-resolution direction-finding algorithms we can
separate the dominant wave angle of incidence so that it becomes possible to
estimate the direction of the emitter (e.g., by employing propagation models).
The disadvantage here is the large amount of information required in advance
(besides the significant computational effort).

Special attention must be paid to the possibility of errors known as “wild 
bearings.” These are caused by violation of the Spatial Sampling theorem above 
a critical frequency and starting at a certain amplitude ratio for the secondary to 
primary waves, usually because too few antenna elements are available within 
the antenna aperture or because coupling effects within the antenna array lead 
to mutual dependencies between the element outputs. 

2.4.5 Immunity to Co-Channel Interfering Signals 

Unlike multipath interference, co-channel interfering signals usually occur in 
the form of uncorrelated signals within the frequency band that contains the 
signal of interest. This can be caused, for example, by electrical equipment and 
systems (sparks on switches and power sinks), overshoots in transmitters, or 
defects in transmitting systems. Correlated co-channel interfering signals occur 
when fixing the position of common-wave emitters. 

2.4.6 Short Minimum Signal Duration 

Advanced transmission techniques sometimes use time division multiple access 
(TDMA), frequency hopping, or burst transmission. Direction finding is possible 
with such signals only if we can process the antenna signals with sufficient speed 
and with the required signal-to-noise ratio and make them available for use in 
computing the bearings. Short signal durations also imply that few snapshots are 
available for averaging so that larger antenna systems are needed to approach 
the same sensitivity available with continuous signals.

2.4.7 High Search Speed and Probability of Intercept
in Scanning Direction Finders

For dependable intercept and direction finding with signals having an unknown 
frequency, the possible frequency range must be processed within an interval of 
time that is as short as possible compared to the transmit duration of the signal 
of interest. 

2.5 ANTENNA ARRAY DESIGN 

The characteristics we can obtain with direction finders are dependent exclusively 
on the characteristics of the antenna array in addition to the system’s intrinsic 
noise: 


• Sensitivity (field strength value for a certain bearing fluctuation) 

• Probability of wild bearings (bearing jumps, outliers) 


The standard model of array signal processing expresses this fact in the narrowband
approximation for N sensors and M waves as the relationship between
the measurement vector $\mathbf{x}(t)$ and the array manifold vectors $\mathbf{a}$, the signals
$s_j(t)$, and the noise vector $\mathbf{n}(t)$ (Van Trees, 2002; Wax, 1995):

$$\mathbf{x}(t) = \sum_{j=1}^{M} s_j(t)\, \mathbf{a}(\varphi_j, \theta_j) + \mathbf{n}(t) \qquad (2.10)$$

The summation occurs over the M waves superimposed on the frequency
channel of interest. The components $a_i(\varphi, \theta)$ of the array manifold vectors (or
steering vectors) represent the complex relative directional characteristics of the
antenna outputs

$$\mathbf{a}(\varphi, \theta) = [a_1(\varphi, \theta) \,\ldots\, a_i(\varphi, \theta) \,\ldots\, a_N(\varphi, \theta)]^T \qquad (2.11)$$

where $\varphi$ represents the azimuth angle and $\theta$ the zenith angle. The continuum
spanned by the vectors $\mathbf{a}(\varphi, \theta)$ in the N-dimensional complex vector space $\mathbb{C}^N$
is the array manifold and shows, so to speak, the directional characteristics of
the N array outputs at a glance.

The model (2.10) illustrates the key importance of the array manifold vectors:

They transform the change in wave angle of incidence into a change in measurement
vector, which we can take advantage of depending on the signal and
noise power. This is expressed by the Cramér-Rao lower bound for attainable
direction-finding sensitivity (Manikas, 2004; Van Trees, 2002).

If, for different wave angles of incidence, array manifold vectors assume similar
directions in $\mathbb{C}^N$ (potentially first-order ambiguities), even insignificant
interference (e.g., due to noise) will cause wild bearings.

If linear combinations of two array manifold vectors with different wave
angles of incidence assume directions in $\mathbb{C}^N$ similar to those of array manifold
vectors for other wave angles of incidence, then we have potentially second-order
ambiguities, which can result in wild bearings depending on the signal
scenario. This also applies to higher orders of ambiguity.

These aspects determine the main design criteria for a DF antenna array:

Largest possible antenna aperture. Large changes in the array manifold vectors
for changes in the wave angle of incidence are achieved using antenna
dimensions that are large with respect to wavelength. From this, the array
geometry can be derived: Circular arrays are useful for panoramic vision in
the azimuth, while linear arrays are useful for intercept angles below 90°.
For intercept angles greater than 90°, it makes sense to use partial circular
arrays.

Number of antenna elements. For a given aperture size, ambiguities can be
avoided by including a sufficient number of antenna elements (= number of
dimensions of the array manifold). The Spatial Sampling theorem provides a
clear justification for this requirement. In any practical implementation, we
must naturally take into account the fact that the coupling between the elements
will increase with a larger number of elements. This can lead to a loss
of sensitivity. More elements also entail a larger number of measurement
paths if we do not wish to increase the measurement time.

We must also recall that in practice the parameter space generally extends
well beyond the wave angles of incidence $\varphi$ and $\theta$. In particular, the polarization
dependency of the array manifold vectors and the influence of different support
platforms (e.g., mast, vehicle) on the array manifold vectors result in bearing
errors if these issues are not taken into account when designing the antenna or
characterizing the array manifold.

A proven design approach involves configuring the antenna array so that 
the sensitivity of the array manifold vectors with respect to the parameters to 
be assessed is maximized and the sensitivity with respect to parameters not of 
interest is minimized. 


Example 2.1 

We want to design an antenna array for use in estimating the azimuth of an
emitter. Since we are not interested in elevation, we can use an antenna geometry
that extends horizontally. Also, we are not interested in polarization, so we
should select our antenna elements based on the predominantly expected polarization
and (through careful decoupling of the feed lines and support structures)
ensure that the array manifold vectors are largely independent of any possible
fluctuations in the polarization parameters. Because we will use the antenna on
masts as well as on vehicles, we must also strive for good decoupling between
the antenna elements and the support structure. This is achieved primarily
through the use of symmetrical antenna elements with very high common-mode
rejection.

A second workable approach involves taking into account the parameter space
as fully as possible when characterizing the antenna vectors. The disadvantage
here is the increased number of unknowns, which must also be estimated and thus
require a larger number of antenna elements and a higher implementation cost. In
addition, the cost of measuring, storing, and managing the parameters increases
as the number of parameters to be taken into account increases.


2.5.1 Design Example 

Figure 2.8 shows an antenna array for direction finding predominantly with vertically
polarized signals in the frequency range from 20 MHz to 3 GHz. The
outer circular array antenna has a diameter of approximately 1 m and covers the
frequency range from 20 MHz to 1.3 GHz. It consists of nine dipole elements
equipped with a high-impedance differential amplifier at the dipole feedpoint to
optimize gain and sensitivity. The top capacitances of the dipole elements also
help to boost the gain at low frequencies. At high frequencies, the radiators can
be electrically shortened using integrated switches to reduce mutual coupling
and minimize backscatter within the operating frequency range of the inner
antenna array. The elements exhibit an approximately omnidirectional characteristic
in the horizontal plane, while the directional characteristic of the phases
corresponds to the expected cosinusoidal shape:

$$a_i(\varphi, \theta) \approx \sin\theta \; e^{\,j k_0 R \sin\theta \cos\left(\varphi - \frac{2\pi (i-1)}{N}\right)} \qquad (2.12)$$

where R is the array radius, $k_0 = 2\pi/\lambda$, and the element number N = 9.
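
The narrowband model (2.10) and the idealized element response (2.12) can be combined into a small simulation of one array snapshot. This is a minimal sketch under stated assumptions: ideal elements without mutual coupling, circular complex Gaussian noise, and hypothetical function names and parameters (the radius is given in wavelengths so that k0·R = 2π·R/λ).

```python
import cmath
import math
import random


def uca_steering_vector(phi, theta, radius_in_wavelengths, n_elements=9):
    """Idealized UCA array manifold vector following the form of Eq. (2.12)."""
    k0_r = 2.0 * math.pi * radius_in_wavelengths
    return [
        math.sin(theta)
        * cmath.exp(1j * k0_r * math.sin(theta)
                    * math.cos(phi - 2.0 * math.pi * i / n_elements))
        for i in range(n_elements)  # i = 0..N-1 corresponds to (i - 1) in Eq. (2.12)
    ]


def snapshot(waves, radius_in_wavelengths, n_elements=9, noise_std=0.0, seed=0):
    """One measurement vector x = sum_j s_j * a(phi_j, theta_j) + n, as in Eq. (2.10).

    waves: iterable of (complex amplitude s_j, phi_j, theta_j)
    """
    rng = random.Random(seed)
    x = [0j] * n_elements
    for s, phi, theta in waves:
        a = uca_steering_vector(phi, theta, radius_in_wavelengths, n_elements)
        x = [xi + s * ai for xi, ai in zip(x, a)]
    if noise_std > 0.0:
        sigma = noise_std / math.sqrt(2.0)  # split the noise power between I and Q
        x = [xi + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for xi in x]
    return x
```

For a horizontally arriving wave (θ = π/2) every component has unit magnitude, and only the phases carry the azimuth information.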



FIGURE 2.8 Antenna array used for direction finding in the frequency range from 20 MHz to 
3 GHz. The upper half of the radome and the lightning rod in the center of the antenna have been 
removed. (Photo courtesy of Rohde & Schwarz.) 










The inner circular array consists of eight dipole elements arranged in front 
of a cylindrical reflector. The omnidirectional reference element is implemented 
through the dipole elements’ summation. The pin located in the center of the 
antenna is used to accommodate a lightning rod. 


2.6 NUMBER OF ANTENNA ELEMENTS 
AND PROCESSING CHANNELS 

To obtain unambiguous DF results, the following parameters must satisfy certain 
conditions: 

• Number N of antenna elements 

• Distance between antenna elements 

• Number M of waves to be resolved 

• Rank $\eta$ of the signal’s correlation matrix

• Array manifold 


Wax and Ziskind (1989) make the basic assumption that any arbitrary set of
N array manifold vectors $\mathbf{a}_i(\Theta)$, $i = 1, 2, \ldots, N$ with disjunct parameter vectors
$\Theta$ are linearly independent. Based on this assumption and where the rank of the
noise-free correlation matrix of the signals $\mathbf{S}$ is

$$\eta = \operatorname{rank}(\mathbf{S}\mathbf{S}^H) \le M \qquad (2.13)$$

it is shown that

$$M < \frac{N + \eta}{2} \qquad (2.14)$$

waves can be subjected to unambiguous direction finding (Wax and Ziskind,
1989). For the limiting case of uncorrelated signals ($\eta = M$) we obtain the known
criterion $M < N$ or $M \le N - 1$.
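
The uniqueness condition reported by Wax and Ziskind (1989), M < (N + η)/2, is easy to evaluate for a given array. The helper below is only a sketch with hypothetical names; it checks the inequality and says nothing about whether a particular algorithm actually reaches this limit.

```python
def uniquely_resolvable(m_waves, n_elements, rank):
    """Check the condition M < (N + eta) / 2 using integer arithmetic."""
    if not 1 <= rank <= m_waves:
        raise ValueError("rank must satisfy 1 <= rank <= number of waves")
    return 2 * m_waves < n_elements + rank


def max_resolvable_waves(n_elements, coherent=False):
    """Largest M satisfying the condition for two limiting cases.

    coherent=False: uncorrelated signals, rank = M, giving M <= N - 1
    coherent=True:  fully coherent signals, rank = 1
    """
    best = 0
    for m in range(1, n_elements + 1):
        rank = 1 if coherent else m
        if uniquely_resolvable(m, n_elements, rank):
            best = m
    return best
```

For a nine-element array this gives eight uncorrelated waves but only four fully coherent ones, which illustrates how strongly correlated multipath reduces the number of waves that can be resolved.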

The starting assumption is critical for antenna array implementation since 
the linear independence of the array manifold vectors must be maintained with 
the sparsest possible configuration of the antenna array, taking into account error 
influences. From a practical point of view, this aspect plays a predominant role 
in determining the minimum number of elements for a given array size. 

Maximum sensitivity and minimum required signal duration can be ensured if
each antenna element is processed using a separate processing channel. If we wish
to simultaneously obtain good large-signal characteristics, such solutions are too
costly in many cases. Thus, we have to deal with a reduced number of processing
channels and divide the antenna array into subarrays. Coherent measurements are
made within the subarrays, which are processed sequentially. Figure 2.9 shows
the basic block diagram of this kind of array data acquisition. Direction-of-arrival
(DOA) estimation methods are treated in Sheinvald and Wax (1999) and Worms
(2002).

Use of only a single measurement path represents a special case. Its practical
significance is that existing receivers can be extended with relative ease to
include a direction-finding function (Demmel and Unselt, 2002). Since the phase
reference from a second measurement path is lacking, the phase information must
be obtained indirectly through amplitude measurements in this case. The basic
principle involves performing this amplitude measurement in several steps with
different antenna states and then computing the wave angles of incidence based
on changes in the measured signal power and known changes in state.

FIGURE 2.9 Sequential measurement of N antenna elements with H processing channels.

Figure 2.10 shows an example of this. The phase difference $\rho_{1k}$ between
reference element 1 and element $k \in \{2 \ldots N\}$ is determined by adding the
output of element k, phase-shifted in four states (0°, 90°, 180°, and 270°),
to the output of reference element 1. A total of 4(N - 1) measurements of the
squared magnitude of the antenna output are required.

FIGURE 2.10 Example of DF with N antenna elements and a single processing channel.

For one measurement cycle, we have

• Measurement 1: $y_{k1} = |x_1 + x_k|^2 = |x_1|^2 + |x_k|^2 + x_1 x_k^* + x_k x_1^*$

• Measurement 2: $y_{k2} = |x_1 - x_k|^2 = |x_1|^2 + |x_k|^2 - x_1 x_k^* - x_k x_1^*$

• Measurement 3: $y_{k3} = |x_1 - j x_k|^2 = |x_1|^2 + |x_k|^2 + j x_1 x_k^* - j x_k x_1^*$

• Measurement 4: $y_{k4} = |x_1 + j x_k|^2 = |x_1|^2 + |x_k|^2 - j x_1 x_k^* + j x_k x_1^*$

By taking the difference, we obtain the real and imaginary parts of the conjugate
complex products $x_1 x_k^*$:

$$y_{k1} - y_{k2} = 4\,\mathrm{Re}(x_1 x_k^*)$$
$$y_{k3} - y_{k4} = -4\,\mathrm{Im}(x_1 x_k^*) \qquad (2.15)$$

For the phase difference:

$$\rho_{1k} = \arctan\frac{y_{k3} - y_{k4}}{y_{k1} - y_{k2}} \qquad (2.16)$$
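
The four-state measurement can be checked numerically. The sketch below simulates the four power readings for a pair of complex element outputs and recovers the phase of element k relative to the reference element. It uses the two-argument arctangent so that the quadrant is resolved, which is a choice made for this example and goes slightly beyond the plain arctangent of Equation (2.16); function names are hypothetical.

```python
import cmath
import math


def four_state_powers(x1, xk):
    """Squared magnitudes of the summed output for the four phase states."""
    y1 = abs(x1 + xk) ** 2       # measurement 1
    y2 = abs(x1 - xk) ** 2       # measurement 2
    y3 = abs(x1 - 1j * xk) ** 2  # measurement 3
    y4 = abs(x1 + 1j * xk) ** 2  # measurement 4
    return y1, y2, y3, y4


def phase_difference(y1, y2, y3, y4):
    """Phase of element k relative to element 1 from power measurements only.

    y1 - y2 = 4*Re(x1 * conj(xk)) and y3 - y4 = -4*Im(x1 * conj(xk)),
    so the result equals arg(xk) - arg(x1), wrapped to (-pi, pi].
    """
    return math.atan2(y3 - y4, y1 - y2)


x1 = 1.0 + 0.0j                  # reference element output
xk = 0.5 * cmath.exp(2.0j)       # element k: half the amplitude, 2 rad phase lead
rho = phase_difference(*four_state_powers(x1, xk))
```

Here `rho` comes out at approximately 2.0 rad, independent of the amplitude ratio of the two elements, because the amplitudes cancel in the ratio of the two differences.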


Another possibility for direction finding using only a single measurement path
involves the use of antenna arrays consisting of parasitic radiators. This published
technique is also referred to as an electronically steerable parasitic array
radiator (ESPAR) antenna (Cheng and Ohira, 2006) and is based on the idea
of intentionally influencing the directional characteristic of the antenna element
connected to the measurement channel by changing the impedances on the parasitic
radiators. Based on the measured amplitude changes at the output, we can
then compute the wave angles of incidence.

2.7 MULTICHANNEL RECEIVERS

To achieve sufficient phase and amplitude balance, identical processing channels 
are needed to implement the measurement paths. These channels convert the 
bandpass signals from the antenna outputs into signal samples in the complex 
baseband. Basically, two methods can be used here. 

Direct receiver. The antenna signals are band-limited to a bandwidth B (reducing
the input power and avoiding problems with aliasing) and are then fed directly
to the A/D converters and digitized using a suitable sampling rate $f_s > 2B$
(Figure 2.11). If possible, the sampling clock should be taken from a common
source as shown in the figure to ensure the required sampling synchronization.
If clock synchronization cannot be managed via a direct connection, the
sampling clock must be derived from precise timing standards.

Down-conversion receiver. If direct A/D conversion is not possible for technical
reasons, the frequency band of interest must first be shifted to a lower
intermediate frequency (IF). In that case, the antenna signals are mixed with a
signal generated in the local oscillator (LO). From the mixing products that arise,
a frequency band that is suitable for further processing is then filtered out
with center frequency $f_{IF}$ and fed to the A/D converter. If the frequency is
shifted to the center frequency $f_{IF} = 0$, then we speak of a “direct conversion
receiver” or “homodyne receiver.” However, a “heterodyne receiver,”
or “superhet,” has fewer problems with undesired mixing products. It uses
an IF with a filter to provide good rejection of undesired mixing products as
well as a favorable frequency for the A/D conversion process (Pace, 2000).
Figure 2.12 is a schematic diagram of a multichannel superhet receiver.

FIGURE 2.11 Direct receiver with multiple channels.

FIGURE 2.12 Multichannel superhet.


Besides providing for sampling synchronization during A/D conversion, we 
must also ensure that the mixers in each processing channel are driven with the 
same phase. In both concepts, adaptation of the receiver’s dynamic range to the 
current signal scenario plays an important role. A separate detector is provided 
in each processing channel to deliver the signal needed for fast automatic gain 
control (AGC) to avoid overdriving the A/D converter. 

The following issues are critical in the receiver design: 

• No undesired mixing products should appear in the IF band. 

• The number of mixing stages should be minimized. 

• Each mixing stage should have as little power applied to it as possible. 

Since these issues entail relatively stringent limits when selecting the
intermediate frequencies, the A/D converter is often implemented in bandpass
sampling mode, for example with an IF band from 47 MHz to 67 MHz and a
sampling frequency of 76 MHz in the A/D converter.
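
The bandpass sampling example can be checked with a few lines of arithmetic.
The following is a minimal sketch, not a receiver design tool: the function
names are our own, and the only assumption is the standard rule that the IF
band must lie entirely within one Nyquist zone (a multiple of $f_s/2$ wide)
to avoid aliasing.

```python
import math


def nyquist_zone(f_low_hz, f_high_hz, fs_hz):
    """Return the 1-based Nyquist zone that contains the whole band,
    or None if the band straddles a multiple of fs/2 (aliasing)."""
    half = fs_hz / 2.0
    zone_low = math.floor(f_low_hz / half)
    zone_high = math.ceil(f_high_hz / half) - 1
    if f_high_hz <= f_low_hz or zone_low != zone_high:
        return None
    return zone_low + 1


def aliased_band(f_low_hz, f_high_hz, fs_hz):
    """Return (low, high, inverted) of the band as it appears in 0..fs/2."""
    zone = nyquist_zone(f_low_hz, f_high_hz, fs_hz)
    if zone is None:
        raise ValueError("band is not contained in a single Nyquist zone")
    half = fs_hz / 2.0
    if zone % 2 == 1:
        # Odd zones are shifted down without spectral inversion.
        base = (zone - 1) * half
        return (f_low_hz - base, f_high_hz - base, False)
    # Even zones are mirrored about the upper zone edge.
    top = zone * half
    return (top - f_high_hz, top - f_low_hz, True)


# Values from the text: IF band 47 MHz to 67 MHz, sampling frequency 76 MHz.
print(nyquist_zone(47e6, 67e6, 76e6))
print(aliased_band(47e6, 67e6, 76e6))
```

With the values from the text, the 20-MHz-wide band falls completely inside
the second Nyquist zone (38 MHz to 76 MHz), so it is sampled without aliasing
but appears frequency-inverted in the digitized signal.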

2.8 WIDEBAND DIRECTION FINDING

A wideband signal is any signal with energy distributed over a bandwidth that 
is not small in comparison to the signal’s center frequency. Direction finding of 
that kind of signal requires methods that are not addressed by the procedures 
traditionally used for estimating a signal’s direction of arrival. 


2.8.1 General 

Spectrum spreading is used increasingly in radio communications nowadays. It 
includes 


• Frequency hopping (FH) 

• Direct-sequence spread spectrum (DSSS) 


In many applications time-compressed (i.e., wideband) burst signals are also
important. Figure 2.13 illustrates the basic functioning of such
low-probability-of-intercept (LPI) techniques in the frequency-time domain.
Conventional continuous signals are also shown by way of reference.

Using frequency hopping, the message to be transmitted is divided into small 
sections (typically lasting a few milliseconds depending on the frequency range) 
and transmitted using carrier frequencies that are varied in a pseudo-random 
manner. The receiver tracks the agreed pseudo-random sequence in order to 
recover the message content. 

In DSSS radio systems, the message signal undergoes an additional high-speed
pseudo-random phase modulation, which causes the desired signal to be
distributed over a wide frequency band. Direction finding with signals of this
sort necessitates bandwidths that are generally no longer small in comparison
to the center frequency. Accordingly, the narrowband approximation of our
signal model no longer holds, and we must now take into account the frequency
response of the antenna characteristics.

Frequency response compensation is commonly referred to as focusing in
technical publications. Two approaches are possible (Krolik, 1991). In
coherent focusing, the measured antenna signals are multiplied by a
compensation term prior to computation of the bearings. This term compensates
for the delay differences between antenna elements. Accordingly, information
is needed in advance about the approximate wave angles of incidence. In
noncoherent focusing, the frequency range we are monitoring is divided into
subranges using a filter bank. Direction finding is then performed based on
the narrowband model. Related parts of the bearing spectrum obtained in this
manner are combined and allocated to an emitter.

FIGURE 2.13 Typical LPI signals and classic continuous signals in the
frequency-time domain.

In real-world applications, noncoherent focusing is most commonly used for 
the following reasons: 

• No a priori information is required. 

• Almost all of the signals are already separated in spectral terms, and even 
with densely occupied scenarios, the signals can be separated in most 
cases with simple single-wave algorithms. 

• Besides the wave angles of incidence, we can obtain estimates of the 
frequency ranges occupied by the emitters. 

Figure 2.14 illustrates the functional principle of a wideband direction finder
that uses noncoherent focusing.

Following A/D conversion and digital down-conversion to the baseband,
the real and imaginary parts of the wideband signal in each measurement path
are fed to a digital filter bank, which is conventionally implemented as a fast
Fourier transform (FFT) or polyphase filter bank. The filter bank breaks down
the wideband signal with bandwidth $B_{RT}$ into $J_F$ narrowband channels with
bandwidth $B_C$, where $B_{RT} = J_F B_C$.

In the next step in the processing chain, the filter bank outputs from each
measurement path are combined to compute the correlation matrices. The number
of snapshots used to estimate the correlation matrices is typically determined
by the operator in the form of an averaging time. The minimum size of the
correlation matrices depends on the particular DF algorithm: Techniques based
on single-wave models need an $N \times 1$ vector for each frequency channel,
while multichannel algorithms require up to $(N^2 + N)/2$ correlation terms per
frequency channel.

FIGURE 2.14 Functional principle of a wideband direction finder.

The correlation matrices and the frequency-dependent array manifold vectors
form the input data for the bearing computation. The bearing spectrum obtained
in this manner is then displayed with a graphical user interface and/or combined
for emitters using cluster techniques (focusing). Figure 2.15 is an example of
this type of results presentation.
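
The filter bank and correlation steps can be sketched in a few lines. This is
an illustration under stated assumptions rather than the processing of any
particular direction finder: a plain block DFT stands in for the FFT or
polyphase filter bank, each block of samples yields one snapshot per channel,
and all function names are our own.

```python
import cmath


def dft(block):
    """Plain DFT of one block (an FFT would be used in practice)."""
    n = len(block)
    return [
        sum(block[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def channel_correlation_matrices(paths, n_channels):
    """paths[m] holds the complex baseband samples of measurement path m.

    Returns R with R[k][i][j] = average over snapshots of x_i * conj(x_j)
    in frequency channel k.
    """
    n_paths = len(paths)
    n_snap = len(paths[0]) // n_channels
    R = [[[0j] * n_paths for _ in range(n_paths)] for _ in range(n_channels)]
    for s in range(n_snap):
        # One filter bank output vector per measurement path.
        spectra = [dft(p[s * n_channels:(s + 1) * n_channels]) for p in paths]
        for k in range(n_channels):
            x = [spectra[m][k] for m in range(n_paths)]
            for i in range(n_paths):
                for j in range(n_paths):
                    R[k][i][j] += x[i] * x[j].conjugate() / n_snap
    return R


def unique_correlation_terms(n_paths):
    """Number of distinct terms of a Hermitian N x N matrix: (N^2 + N) / 2."""
    return (n_paths ** 2 + n_paths) // 2
```

Because each matrix is Hermitian, only the diagonal and one triangle need to
be stored, which is where the count of $(N^2 + N)/2$ terms per frequency
channel comes from.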

Using today’s technology in combination with the necessary dynamic range,
the real-time bandwidth is limited to several tens of MHz. If the center
frequency of the broadband signal is unknown, the frequency ranges we must
analyze typically become much larger: several hundred MHz in the V/UHF
range and several GHz in the SHF range. We can implement a system for
interception and direction finding over this larger frequency range through


• Multiple filter bank receivers with shifted center frequencies 

• Sequential shifting of the filter bank frequencies (frequency scan) 


The first method ensures the maximum probability of intercept, but is costly
to implement. The second method is significantly more economical to implement
and is thus commonly used even though the probability of intercept decreases
as the search range widens. The shifting of the filter bank frequency is typically
carried out by changing the receiving center frequency of the RF converter
(shifting of the LO frequency), whereas the A/D converter and the digital filter
bank operate at fixed IF center frequencies. Figure 2.16 is a schematic
representation of how this works.

FIGURE 2.15 Graphical results presentation for a wideband direction finder. The spectrum
of bearings allows detection, DF processing, and frequency estimation of the spread-spectrum
signal buried in the noise of a single receiving channel (lower part of the display). (Screenshot
by Rohde & Schwarz.)

FIGURE 2.16 Basic operation of a wideband direction finder during scanning.


2.8.2 Interception of Frequency-Hopping Signals 

Because of its great significance in actual practice, we now examine the
probability of intercept of a frequency-hopping signal using a scanning wideband
direction finder.

We assume that the frequency-hopping transmitter has $J_H$ channels with
channel hopping in a random sequence. The direction finder’s filter bank covers
$J_F$ channels. During scanning, the filter bank is shifted in $L$ steps so that
during one search cycle $J_{SC} = L J_F$ channels are covered. We also assume
that the frequency channels of the frequency-hopping transmitter and the search
range of the direction finder overlap by $J_C$ channels (Figure 2.17).

The length of a hop is equal to $t_H$. We assume that the direction finder’s
measurement time for a snapshot is equal to $t_M$, where $t_M < t_H$. Changing
the direction finder’s center frequency (shifting the filter bank) requires a
duration $t_S$.

The probability $P_1$ of detecting a single FH burst with the direction finder is

$$P_1 = \frac{J_C J_F}{J_H J_{SC}} \tag{2.17}$$

The average number $n_{av}$ of valid detection attempts (i.e., the measurement
takes place within one FH burst) is obtained as follows (Horing, 1998):

$$n_{av} = \frac{t_H - t_M}{t_S + t_M} \tag{2.18}$$

FIGURE 2.17 Interception of a frequency-hopping signal using a scanning filter bank receiver.


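The probability of detecting a burst with $n_{av}$ valid detection attempts is then

$$P_{1h} = \begin{cases}
P_1 n_{av} = \dfrac{J_C J_F (t_H - t_M)}{J_H J_{SC} (t_S + t_M)}, & t_H \le \dfrac{J_{SC}}{J_F}(t_M + t_S) + t_M \\[2ex]
\dfrac{J_C}{J_H}, & t_H > \dfrac{J_{SC}}{J_F}(t_M + t_S) + t_M \\[2ex]
1, & J_{SC} > J_H,\ t_H > \dfrac{J_{SC}}{J_F}(t_M + t_S) + t_M
\end{cases} \tag{2.19}$$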

The probability of detecting at least $k$ bursts with multiple measurements and
transmission of a total of $N$ bursts is computed based on the binomial distribution

$$P(z \ge k) = 1 - \sum_{i=0}^{k-1} \binom{N}{i} P_{1h}^{\,i} (1 - P_{1h})^{N-i} \tag{2.20}$$

where $N$ is obtained using the duration $t_{TX}$ of the FH transmission as

$$N = \frac{t_{TX}}{t_H} \tag{2.21}$$


Example 2.2 

An FH transmitter is operating in the frequency range from 50 MHz to 80 MHz
with 500 hops/s ($t_H = 2$ ms) with a frequency grid of 25 kHz ($J_H = 1200$). The
direction finder has a real-time bandwidth of 10 MHz and scans the range from
20 MHz to 90 MHz using 25-kHz-wide filter bank channels (i.e., $J_F = 400$,
$J_{SC} = 2800$). The measurement time for the direction finder is equal to
$t_M = 0.32$ ms; the settling time $t_S$ is equal to 1 ms.

We want to know the probability of detecting and determining the direction
of at least $k = 10$ FH bursts during a specified time interval. Figure 2.18 shows
the resulting probability as a function of the transmit duration $t_{TX}$.

FIGURE 2.18 Probability of detection of and direction finding for at least 10 hops of a
frequency-hopping transmitter as a function of transmit duration.
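
The calculation behind Example 2.2 can be sketched as follows. This is a
hedged illustration: the function names are our own, times are in
milliseconds, and we assume $J_C = J_H = 1200$ because the hop range of
50 MHz to 80 MHz lies entirely inside the scanned range. Only the first two
cases of Equation (2.19) are modeled.

```python
import math


def single_burst_probability(j_c, j_f, j_h, j_sc, t_h, t_m, t_s):
    """Probability of catching one FH burst, Equations (2.17) to (2.19)."""
    full_scan = (j_sc / j_f) * (t_m + t_s) + t_m
    if t_h > full_scan:
        # The hop outlasts a full search cycle: only the overlap matters.
        return j_c / j_h
    p1 = (j_c * j_f) / (j_h * j_sc)                 # Equation (2.17)
    n_av = max(0.0, (t_h - t_m) / (t_s + t_m))      # Equation (2.18)
    return p1 * n_av


def prob_at_least_k(k, n_bursts, p):
    """Binomial probability of at least k detections, Equation (2.20)."""
    miss = sum(
        math.comb(n_bursts, i) * p ** i * (1.0 - p) ** (n_bursts - i)
        for i in range(k)
    )
    return 1.0 - miss


# Example 2.2, all times in ms.
p_1h = single_burst_probability(1200, 400, 1200, 2800, t_h=2.0, t_m=0.32, t_s=1.0)
for t_tx_ms in (50, 100, 200, 400):
    n = int(t_tx_ms // 2.0)                         # Equation (2.21)
    print(t_tx_ms, round(prob_at_least_k(10, n, p_1h), 3))
```

With the values of the example, $P_1 = 1/7$ and $n_{av} = 1.68/1.32$, so a
single burst is caught with a probability of $2/11$, or roughly 18%. The
probability of at least 10 detections then rises with the number of
transmitted bursts.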


2.9 IMPLEMENTATION ASPECTS OF HIGH-RESOLUTION 
DIRECTION FINDING 

In spite of the attractive features of high-resolution methods and the large amount
of literature on the subject, most of the currently used direction-finding
techniques are based on classical single-wave algorithms. Possible reasons will be
discussed in the following section.

2.9.1 Conventional versus High-Resolution 
Direction Finding 

If unwanted interfering waves are present in the frequency channel of interest
in addition to the desired wave, conventional DF techniques based on single-wave
models will produce bearing errors whose size depends on the antenna
configuration (geometry) used. There are two approaches for solving this problem:

• If the interfering wave component has less power than the desired wave
component, we can minimize the bearing error through the use of suitable
dimensioning of the direction finder (especially by selecting a sufficiently
large antenna base).

• If the interfering wave component is larger than or equal to the desired wave
component, we also have to determine the interfering waves in order to
eliminate them in turn. When using conventional beamforming algorithms, this
means that we must also evaluate the secondary maxima in the DF function.
We reach the relevant limits when the ratio between the primary maximum
and the secondary maxima of the directional characteristic becomes too small
or when the angular difference between the wanted and interfering waves is
less than the width of the main lobe. By optimizing the weighting factors,
we can reduce the level of the secondary maxima, but at the expense of
simultaneous widening of the primary maximum.


The objective of high-resolution (HR) direction-finding techniques is to
circumvent this disadvantage. Of course, in practical applications we must keep
in mind that even if we know the number and directions of the waves, we still do
not have definitive information about the direction of the emitter of interest.

If the interfering waves are correlated with the desired wave (i.e., if there
is multipath propagation), emitter direction must be estimated based on the
distribution of the directions and amplitudes of the waves we are analyzing. In
many cases, this can be handled more efficiently using beamforming or
correlation techniques since the estimation of the emitter direction from the
coherent secondary waves is implicitly contained in the formation of the
bearings here (assuming we have a sufficiently large antenna aperture).

If we are dealing with uncorrelated signals from different sources, unambiguous
assignment is possible only if all of the signals except the one of interest can
be suppressed through suitable shaping of the antenna’s directional characteristic
(signal copy filter, null steering, adaptive antenna), so that we can, for example,
use modulation parameters to make the assignment.

With the current state of the art, HR techniques are clearly used as an
extension of, rather than a replacement for, single-wave techniques. This is
due on the one hand to the fact that when exploiting the spectral and timing
structure of the signals, separation and direction finding with single-wave
algorithms are successful in the majority of cases. On the other hand, radio
scenarios are highly transient so that general usage of a certain HR technique
fails because of inconsistent estimation of the number of involved waves.
Accordingly, HR techniques are used more for targeted analysis of overlapping
signals or for investigation of narrow sections of the receive spectrum and not
so much with a direction finder’s search mode. Figure 2.19 shows an example.


2.9.2 Practical Limitations of High-Resolution Methods 

When implementing HR techniques, it is essential to minimize 

• The deviation of the actual array manifold vectors from those used to 
estimate the direction (modeling errors) 

• The measurement errors for the array signals 







FIGURE 2.19 Bearing spectrum and evaluation of a spectral section with high-resolution direction
finding (MUSIC applied to a 9-element UCA 3 m in diameter). (a) Display obtained by a
single-wave beamforming algorithm showing a result of 179°. (b) HR evaluation showing the hidden
signal impinging at 144° and the correct value of 182° of the stronger signal. (Screenshot by
Rohde & Schwarz.)


These errors have similar consequences, yet they have entirely different 
causes. Measurement errors for the array signals can be kept relatively low 
by carefully designing the measurement paths and ensuring good performance 
for internal calibration. Typical values we can attain are around 0.3-dB RMS 
amplitude error and 0.5° RMS phase error for frequencies up to about 3 GHz. 
At higher frequencies, we can expect to see an approximately linear increase in
these errors, which, in comparison with modeling errors, play a secondary role.

Modeling errors (see next section) represent the greatest challenge in
implementing HR techniques (Brandwood, 1994; Friedlander, 1990). For example,
if we consider the limit of MUSIC’s resolution capability with a circular array
antenna having diameter $D$, the result, in accordance with Brandwood (1994), is
approximately as follows:

$$\Delta\phi \approx \frac{\ldots}{\pi D} \tag{2.22}$$

$\Delta\phi$ is the angular difference between two signals for the case in which
the two maxima are no longer formed in the DF function; $\sigma_p$ is the standard
deviation of the phase errors in radians. If we compare the angular difference
$\Delta\phi$ with the null width $BW_{UCA}$ of a conventionally shaped directional
characteristic, we obtain the following, also in accordance with Brandwood (1994):

$$\frac{\Delta\phi}{BW_{UCA}} \approx 1.98 \sqrt{\sigma_p} \tag{2.23}$$

FIGURE 2.20 Limit of MUSIC’s resolution capability due to phase errors for different diameters
of a UCA.

A standard deviation of the modeling error of, say, 5° (0.0873 radians) and
an antenna size $D = \lambda$ limits the ability to distinguish two signals to
$\Delta\phi = 25.6^\circ$. This corresponds to about 60% of the null width of the
lobe of a conventionally shaped antenna pattern.

Smaller phase errors and the associated increase in possible resolution
are difficult to obtain with certainty, particularly over longer time intervals.
Figure 2.20 shows the influence of phase errors and antenna size on the resolution
capability according to Equation (2.22).
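
The "about 60%" figure quoted above follows directly from Equation (2.23).
The short sketch below repeats that arithmetic; the function names are our
own, and the implied null width is simply the quoted 25.6° divided by the
computed ratio, not a value taken from the source.

```python
import math


def resolution_ratio(sigma_p_rad):
    """Resolution limit relative to the null width, Equation (2.23)."""
    return 1.98 * math.sqrt(sigma_p_rad)


def implied_null_width_deg(delta_phi_deg, sigma_p_rad):
    """Null width that is consistent with a given resolution limit."""
    return delta_phi_deg / resolution_ratio(sigma_p_rad)


sigma_p = math.radians(5.0)            # 5 degrees RMS phase error
print(round(resolution_ratio(sigma_p), 3))
print(round(implied_null_width_deg(25.6, sigma_p), 1))
# Halving the phase error improves the limit only by a factor of sqrt(2).
print(round(resolution_ratio(sigma_p / 2) / resolution_ratio(sigma_p), 3))
```

The square-root dependence explains why the resolution gain from tighter
phase tolerances is modest: a fourfold reduction of the phase error is needed
to halve the resolvable angular difference.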

2.10 ERROR SOURCES 

Direction finding accuracy is influenced by a number of factors: 

• Wave propagation (typically disrupted by obstacles such as mountains or 
buildings). 

• Signals radiated by emitters are modulated or limited in time and their 
carrier frequency is often unknown. Also, their polarization is not clearly 
defined. 

• The received field has additional superimposed noise and co-channel
interfering signals.

• Noise caused by components of the DF system. 

• Measurement errors. 

• Modeling errors. 

2.10.1 Multipath Wave Propagation

As already mentioned, the simple case of a plane wave seldom occurs in
practice. In a real-world environment, additional waves typically must be taken
into account. These are generated by other emitters in the same frequency channel
(incoherent interference) or arise as secondary waves due to reflection,
refraction, or diffraction (coherent in-channel interference). In general, a
large number of waves are involved. Figure 2.21 shows, for example, the azimuth
distribution of the waves generated by a mobile transmitter in a built-up area.
The direct wave component with amplitude 1 arrives from an angle of 90°.

Figure 2.22 shows the resultant wave front in the form of a contour plot for
phase and amplitude. If most of the waves arrive from the direction of the emitter,
the bearing error can be sufficiently reduced by increasing the aperture of the
antenna system. This effect is illustrated in Figure 2.23 for an interferometer
direction finder.


2.10.2 Polarization Errors 

The antennas used by the emitters we are tracking will usually have a distinct 
polarization (often vertical), but it is influenced by the antenna support itself. 
Examples include antennas mounted on vehicles, aircraft, and even man-pack 
radios. In the case of direction finding for interference sources, there is no defined 
polarization since it depends on the spatial position of the conductors producing 
the emission. During transmission over the propagation path, the waves are
subject to diffraction and reflection, which results in additional changes in
polarization.

FIGURE 2.21 Distribution of wave amplitudes versus incident angle as produced by an emitter
located in a suburban area.

FIGURE 2.22 Contour plot of electric field strength resulting from superposition of the
multicomponent wavefield from Figure 2.21.

FIGURE 2.23 Bearing error of a 9-element and a 5-element interferometer in a multiwave field
versus antenna aperture. In the small-aperture region the bearing errors are independent of the number
of elements, but the 5-element array fails in the large-aperture region ($D/\lambda > 1$).

In the DF antenna, the antenna elements as well as the support structure
are excited with field components having nominal polarization direction (e.g.,
vertical or horizontal) as well as those that are cross-polarized. If these
cross-polarized field components are not sufficiently rejected, bearing errors
will arise depending on the components’ relative magnitude and phase.


2.10.3 Internal Noise 

We discussed the influence of internal (instrument) noise in the preceding
sections. This noise is caused by random charge-carrier movements in electronic
modules, such as amplifiers, mixers, and A/D converters, and in resistors and
other lossy components. From a systems perspective, it is useful to imagine all
of the noise components in a processing channel arising in the antenna element.

This equivalent noise source has the available power

$$P_{n0} = F_S k T_0 B \tag{2.24}$$

and produces at the output of the processing channel the same signal-to-noise
ratio with measurement bandwidth $B$ as the noise sources’ actual distribution.
$F_S$ is the system noise factor for a processing channel, $k$ is the Boltzmann
constant ($1.38 \times 10^{-23}$ Ws/K), and $T_0 = 290$ K.

Since we can assume that the noise sources in the processing channels are
uncorrelated, for the same transfer power gain $G_T$ and the same noise in the
processing channels, we obtain the following for the covariance matrix of the
internal noise:

$$\mathbf{R}_n = \sigma_n^2 \mathbf{I} = F_S k T_0 B G_T \mathbf{I} \tag{2.25}$$

where $\mathbf{I}$ is the identity matrix.
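
As a numerical illustration of Equations (2.24) and (2.25), the following
sketch evaluates the equivalent noise power and builds the diagonal covariance
matrix. The noise figure of 10 dB, the 25-kHz bandwidth, and the function
names are assumptions chosen for the example, not values from the text.

```python
import math

K_BOLTZMANN = 1.380649e-23   # Ws/K
T0 = 290.0                   # K


def equivalent_noise_power_w(noise_factor, bandwidth_hz):
    """Available power of the equivalent noise source, Equation (2.24)."""
    return noise_factor * K_BOLTZMANN * T0 * bandwidth_hz


def to_dbm(power_w):
    """Convert a power in watts to dBm."""
    return 10.0 * math.log10(power_w / 1e-3)


def internal_noise_covariance(n_channels, noise_factor, bandwidth_hz, gain):
    """Covariance matrix of uncorrelated internal noise, Equation (2.25)."""
    sigma2 = equivalent_noise_power_w(noise_factor, bandwidth_hz) * gain
    return [
        [sigma2 if i == j else 0.0 for j in range(n_channels)]
        for i in range(n_channels)
    ]


noise_factor = 10.0 ** (10.0 / 10.0)   # assumed noise figure of 10 dB
print(round(to_dbm(equivalent_noise_power_w(noise_factor, 25e3)), 1))
```

For a noise factor of 1 and a bandwidth of 1 Hz, the expression reduces to
the familiar thermal noise density of about -174 dBm/Hz.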


2.10.4 External Noise and Interference 


Particularly at frequencies below about 100 MHz, external noise plays a role that
cannot be ignored. It is caused by thunderstorms (atmospheric noise), machinery
(human-made noise), and processes in the universe (galactic noise). Figure 2.24
illustrates the variation in the noise factor $F_{ex}(\mathrm{dB}) = 10 \lg(F_{ex})$
for external noise versus frequency in accordance with ITU (2007). The noise
factor $F_{ex}$ corresponds to the ratio of the available noise power $P_{n0ex}$
at the output of a short antenna to thermal noise power at the temperature
$T_0 = 290$ K, where both power levels are measured with bandwidth $B$:

$$F_{ex} = \frac{P_{n0ex}}{k T_0 B} \tag{2.26}$$

FIGURE 2.24 External noise in accordance with Recommendation ITU-R P.372-9. (Source: ITU,
2007, with permission.)


The antenna array’s elements can maximally extract the available external
noise power $P_{n0ex} = F_{ex} k T_0 B$ from the received wavefield. The actually
measured external noise power $P_{nex}$ is a function of the transfer power gain
$G_T$ of the processing channel:

$$P_{nex} = F_{ex} G_T k T_0 B \tag{2.27}$$


While we can assume that the noise generated in the processing channels is
uncorrelated, the external noise, which is coupled in, is partially correlated. The
correlation between antenna elements is a function of the distance related to the
wavelength and the spatial intensity distribution of the external noise (Hudson,
1981). For example, for the covariance between elements $i$ and $j$ for isotropic
distribution of the noise sources, we have

$$P_{nex,ij} = P_{nex} \cdot \ldots \tag{2.28}$$


Discrete external interference sources occur in the form of in-band and
out-of-band interfering signals and are represented as correlated sources in the
processing channels. In the case of strong out-of-band signals, interference is
often caused by intermodulation and reciprocal mixing.




































































2.10.5 Measurement Errors 

Different gain and phase in the receive sections cause bearing errors that become
greater as the antenna aperture relative to the wavelength becomes smaller. As
already mentioned, the receive sections of most multichannel direction finders 
are calibrated for amplitude and phase balance with the aid of a test generator 
prior to DF operation. The transmission parameters in each section are measured 
in terms of their magnitude and phase and the level and phase differences are 
stored. During the DF process, the measured values are corrected using the stored 
values before the bearing is calculated. The statistical measurement uncertainty 
that arises as a result of the A/D converter’s limited resolution is normally added 
to the internal noise sources. 


2.10.6 Modeling Errors 

Modeling errors arise primarily because of interactions in the near field of the 
antenna array, that is, 

• Mutual coupling between antenna elements 

• Coupling between antenna elements and mechanical support elements, 
including cabling 

• Reflection and diffraction caused by antenna support structures (e.g., mast, 
housing) 

• Reflection and diffraction caused by objects in the vicinity of the antenna 
array (additional antennas, lamps, power lines, cars, etc.) 


It is relatively easy to take into account the coupling between the antenna
elements either by measurement or theoretical modeling (Friedlander and Weiss,
1991). However, it is significantly harder to take into account the influences of
the support construction and the objects situated in the immediate vicinity of
the antenna since they are highly dependent on the wave angle of incidence and
frequency. Moreover, they are themselves subject to influences that are not
constant over time and cannot be measured adequately because of their complexity
(temperature, humidity, corrosion of contacts, etc.).


2.11 TEST AND MEASUREMENT PROCEDURES 

By examining the test procedures used to measure the relevant performance 
parameters in direction finders, we can build the foundation needed to measure 
and understand the significance of these parameters in real-world applications. 
Based on the divergent requirements for the test environment, we categorize our 
test techniques as outdoor and indoor. 

2.11.1 Outdoor Tests 

With outdoor tests, the focus is on measuring the parameters that are closely 
related to the properties of the antenna system. 

Test Range

The test range basically consists of a rotary stand where we can mount the DF
antenna along with the required number of test transmitters with the transmit
antenna (Figure 2.25).

The size of the test range is oriented primarily toward the required distance $d$
between the DF antenna and the test transmitter. This is based on the requirement
that the electromagnetic field in the area of the DF antenna must approximately
contain only components present in the far field of the transmit antenna
($d \to \infty$). As we know, the surfaces with equal phase are curved in the near
field of a transmit antenna, and additional field components exist as well.
Accordingly, we must take both of these circumstances into account when
selecting this distance.

The curvature of the phase planes is independent of the design of the transmit
antenna if it is small in comparison to the distance being considered. In this case,
we can approximate the transmit antenna as a point radiator. The influence of
the curvature is then dependent only on the extension $A$ of the DF antenna and
the wavelength $\lambda$, since the phase deviation $\delta$ with respect to a
planar wave increases toward the edge of the DF antenna. Figure 2.26 illustrates
the underlying geometry.

For $b \ll A$ and $b \ll d$ we have

$$b \approx \frac{A^2}{8d} \tag{2.29}$$

FIGURE 2.25 Test range for direction-finder measurements.

FIGURE 2.26 Geometry underlying the distance requirement.


The phase deviation at the edge of the DF antenna aperture (width $A$) is thus
equal to

$$\delta = \frac{\pi A^2}{4 d \lambda} \tag{2.30}$$

and the minimum distance for a specified phase deviation is

$$d \ge \frac{\pi A^2}{4 \delta_{tol} \lambda} \tag{2.31}$$

A point of reference for the tolerable phase deviation $\delta_{tol}$ is provided by
the tolerances of the antenna array’s edge elements at the shortest operating
wavelength. For example, if the tolerances are specified as 10° at a frequency of
1 GHz, 2° of additional phase deviation due to the test distance have practically
no influence on the measurement result. For a 1-m DF antenna, a distance of
d > 75 m is thus sufficient.
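
The 75-m figure can be reproduced from Equation (2.31). The sketch below is a
plain evaluation of Equations (2.30) and (2.31); the function names are our
own, and the tolerable phase deviation must be given in radians.

```python
import math

C0 = 299_792_458.0   # speed of light in m/s


def edge_phase_deviation_rad(aperture_m, distance_m, wavelength_m):
    """Phase deviation at the aperture edge, Equation (2.30)."""
    return math.pi * aperture_m ** 2 / (4.0 * distance_m * wavelength_m)


def min_test_distance_m(aperture_m, wavelength_m, delta_tol_rad):
    """Minimum test distance for a tolerable deviation, Equation (2.31)."""
    return math.pi * aperture_m ** 2 / (4.0 * delta_tol_rad * wavelength_m)


wavelength = C0 / 1e9                      # 1 GHz
d_min = min_test_distance_m(1.0, wavelength, math.radians(2.0))
print(round(d_min, 1))                     # close to the 75 m quoted above
```

Because the required distance grows with $A^2/\lambda$, doubling the aperture
at the same frequency quadruples the length of the test range.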

For wavelengths that are large in comparison to the aperture, the curvature
of the wavefronts plays a secondary role. Here, we must watch out for the
near-field components of the field, which represent reactive power and thus prevent
the phase of the wave from varying in a linear fashion versus the distance. If we
consider a simple dipole as our transmit antenna, this range will extend to about
$d/\lambda \approx 1$ (Balanis, 1982).

Special care is needed to avoid intolerable interference with wave propagation
(due to buildings, cables, vehicles, etc.). Clearing obstacles only from the
first Fresnel zone is definitely not sufficient. The following analysis of a wave
reflected by an obstacle by 90° with respect to the receive direction of the primary
wave provides guidelines for the extent to which obstacles must be eliminated.
In a system with a small aperture (antenna size $A \ll \lambda$), the reflected
wave must not exceed an amplitude of approximately 2% of the direct wave so as
not to cause more than 1° of bearing error in this scenario. Assuming equal
intensities of the impinging field at the location of the DF antenna and the
obstacle, the ratio between the amplitude $|E_r|$ of the reflected wave and the
amplitude $|E_d|$ of the direct wave is determined by the distance $r$ between the
antenna and the obstacle and the obstacle’s radar cross-section, RCS (Balanis,
1982):

$$\left| \frac{E_r}{E_d} \right| = \sqrt{\frac{RCS}{4 \pi r^2}} \tag{2.32}$$


With a thin obstacle (wire, mast, etc.) of length $l$ and diameter $\ll l$, we
obtain a scattering cross-section, RCS, at its first resonance ($l = \lambda/2$) of

$$RCS_{\lambda/2} = \frac{\lambda^2 D_{\lambda/2}^2}{\pi} = 0.86 \lambda^2 \tag{2.33}$$

$D_{\lambda/2}$ is the directivity of the scatterer; in the case of a thin wire at
half-wave resonance, $D_{\lambda/2} = 1.64$. Considering Equations (2.32) and
(2.33) and $|E_r/E_d|_{max} = 0.02$, we must maintain a distance of

$$r_{min} = \frac{0.26 \lambda}{|E_r/E_d|_{max}} = 13 \lambda$$

to obtain not more than 1° of bearing error in small aperture systems.
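
The clearance of about 13 wavelengths can be retraced numerically. The sketch
below assumes the square-root form of Equation (2.32), which is the form that
reproduces the factor 0.26 quoted above, and expresses all lengths in units of
the wavelength; the function names are our own.

```python
import math


def half_wave_wire_rcs(wavelength_m, directivity=1.64):
    """RCS of a thin wire at half-wave resonance, Equation (2.33)."""
    return directivity ** 2 * wavelength_m ** 2 / math.pi


def reflected_to_direct_ratio(rcs_m2, r_m):
    """Amplitude ratio |E_r / E_d| at distance r, Equation (2.32)."""
    return math.sqrt(rcs_m2 / (4.0 * math.pi * r_m ** 2))


def min_obstacle_distance(rcs_m2, max_ratio):
    """Distance at which the amplitude ratio drops to max_ratio."""
    return math.sqrt(rcs_m2 / (4.0 * math.pi)) / max_ratio


rcs = half_wave_wire_rcs(1.0)              # wavelength normalized to 1
print(round(rcs, 2))                       # RCS in units of wavelength^2
print(round(min_obstacle_distance(rcs, 0.02), 1))   # distance in wavelengths
```

At 100 MHz, for example, this corresponds to keeping resonant wires and masts
roughly 40 m away from a small-aperture DF antenna.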

DF Accuracy 

The DF antenna is arranged in fixed angular positions, and the bearing is recorded.
We must set the transmit power high enough so that the instrument noise does
not influence the measurement. Now we can measure at different frequencies
and with the smallest possible angular steps. For routine tests on antennas, we
typically measure using 10° steps. The individual bearing errors $\Delta\phi_i$ are
obtained from the difference between the bearing $\hat{\phi}_i$, which we read off,
and the setting of the rotary table $\phi_{POS,i}$ relative to the test transmitter:

$$\Delta\phi_i = \phi_{POS,i} - \hat{\phi}_i \tag{2.34}$$

The total error is typically specified in the form of an RMS error. Of course, we
must specify the sample type involved (e.g., azimuth only at constant frequency
or azimuth and frequency in a single sample).
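
A short sketch of the evaluation of such a measurement series follows. The
wrapping of the differences into the range from -180° to 180° is our own
addition (it keeps a reading of 359° at a table setting of 0° from being
counted as a 359° error), and the function names are hypothetical.

```python
import math


def wrap_deg(angle_deg):
    """Wrap an angle difference into the interval [-180, 180)."""
    return (angle_deg + 180.0) % 360.0 - 180.0


def bearing_errors_deg(table_positions_deg, bearings_deg):
    """Individual bearing errors per Equation (2.34), wrapped."""
    return [
        wrap_deg(pos - brg)
        for pos, brg in zip(table_positions_deg, bearings_deg)
    ]


def rms(values):
    """Root-mean-square value of a list of errors."""
    return math.sqrt(sum(v * v for v in values) / len(values))


positions = [0.0, 10.0, 350.0]     # rotary table settings in degrees
readings = [359.0, 11.0, 351.0]    # bearings read off the direction finder
errors = bearing_errors_deg(positions, readings)
print(errors, rms(errors))
```

The same RMS routine can be applied to a sample taken over azimuth only or
over azimuth and frequency together, as long as the sample type is stated
with the result.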

Sensitivity 

Before beginning the actual measuring, we arrange a field strength-measuring
antenna in place of the DF antenna and measure the transmitter power $P_{T0}$
and field strength $E_0$ versus frequency. For determining the sensitivity of the
direction finder, we can reduce the transmitter power to a value $P_T$, which results
in a specified bearing fluctuation. Based on the ratio of transmit power to field
strength $E_0$, we then obtain the sensitivity value as follows:

$$E_{sens} = E_0 \sqrt{\frac{P_T}{P_{T0}}} \tag{2.35}$$

Influence of Polarization

Polarization measurement is analogous to bearing accuracy measurement. First, 
we determine the bearing for defined azimuth and elevation angles with the 
nominal polarization of the transmit antenna. For linear polarization, we then 
arrange the transmit antenna at an angle corresponding to the desired polarization 
angle. The polarization error is thus equal to the currently displayed bearing 
minus the bearing determined with nominal polarization. It is good practice to 
check the polarization at the DF antenna location as a function of the transmit 
antenna’s offset position, and to prepare a compensation table if necessary. 

Immunity to Multipath Propagation 

For a defined generation of a secondary wave, we set up a second transmit
antenna. Both antennas are connected to the test transmitter via a power splitter.
Using a variable attenuator, we can vary the amplitude of the signal for the
antenna that simulates the second propagation path (Figure 2.27).

To perform tests for different phase angles of the secondary signal, we use
a variable phase shifter (e.g., line sections of varying lengths) in addition to the
attenuator. In this measurement, only the primary antenna is involved in direction
finding at first. Then the secondary path is connected with a certain amplitude
ratio and the phase of the signal is varied until the maximum bearing error is
attained.

FIGURE 2.27 Setup for measuring immunity to multipath propagation.

Angular Resolution 

The test setup is the same as that used for simulation of multipath propagation. 
For testing with uncorrelated signals, the second antenna is supplied by a separate 
test generator. The bearing accuracy as well as certain other parameters can be 
measured as a function of the angular distance for the two transmit antennas. 


2.11.2 Indoor Tests 

Measurements not directly affected by the antenna properties are easier to make 
indoors. This is true in particular for measurements in which immunity to large 
interfering signals must be verified. 


Antenna Simulators 

If we also want to test the DF system indoors in DF mode, we must substitute an 
antenna simulator for the DF antenna array. The relative positions of the antenna 
elements with respect to the wave angle of incidence are replaced by different 
delay lines (Figure 2.28). The decoupling networks, calibration signal feeds,
interface modules, and so on, from the original antenna are used. 


FIGURE 2.28 Antenna simulator for two different angles of wave incidence. (Diagram
labels: From Test Generator 1, From Test Generator 2, Delay Lines, Calibration Signal,
RF to DF Processor, Control.)

FIGURE 2.29 Setup for testing immunity against strong interfering signals.


Immunity to Strong Interfering Signals: Blocking, Reciprocal Mixing

Figure 2.29 shows the relevant test setup. A generator at the center frequency
$f_S$ simulates the desired signal with the wave angle of incidence $\theta_S$. The second
generator at the center frequency $f_I$ provides the interfering signal with the
wave angle of incidence $\theta_I$. In the measurement, the result data for the desired
signal is ascertained first (bearing, amplitude, signal-to-noise ratio, etc.). Next,
the interfering signal is connected and its power is increased until the maximum
allowed influence on the result data is attained.

Inter-Modulation 

The test setup from Figure 2.29 is also used here. Instead of the desired signal at
$f_S$, however, a second interfering signal at $f_{I2}$ is generated. Evaluation is carried
out at frequency $f_S = |f_{I1} \pm f_{I2}|$ to measure the second-order inter-modulation, and
at $f_S = |2f_{I1} - f_{I2}|$ or $f_S = |f_{I1} - 2f_{I2}|$ to measure the third-order inter-modulation.
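
As a quick aid for planning such a test, the evaluation frequencies can be tabulated from the two interferer frequencies. The sketch below is illustrative only; the function name and the example frequencies are hypothetical and not taken from the text.

```python
def intermod_products(f_i1, f_i2):
    """Second- and third-order inter-modulation frequencies of two interferers (Hz)."""
    second = sorted({abs(f_i1 + f_i2), abs(f_i1 - f_i2)})
    third = sorted({abs(2 * f_i1 - f_i2), abs(f_i1 - 2 * f_i2)})
    return {"second_order": second, "third_order": third}

# Hypothetical interferers at 100 MHz and 101 MHz.
products = intermod_products(100.0e6, 101.0e6)
```

The receiver is then tuned to each listed frequency in turn while the two interfering signals are applied.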

Dynamic Range 

The lower limit of the dynamic range is determined by the minimum power level 
of a desired signal for which certain performance data is fulfilled (e.g., a certain 
bearing fluctuation or a certain signal-to-noise ratio in the spectrum display). 
The upper limit is determined by the power of the signals that degrade the 
performance data of the desired signal by a defined amount. This measurement 
is performed analogously to the measurements previously described.

Calibration in Array Processing


Mats Viberg, Maria Lanne, Astrid Lundgren 


3.1 INTRODUCTION 


High-resolution direction-of-arrival (DOA) estimation has been an active area of 
research since the late 1970s. Its applications are wide ranging, including passive 
listening arrays, radar and sonar, and spatial (or space-time) characterization of 
wireless communication channels. Conventional beamforming techniques for 
direction estimation are limited by the aperture (or physical size) of the array. 
In contrast, parametric methods promise an unlimited resolution under ideal 
conditions (i.e., no noise or modeling errors). These methods take advantage 
of a precise mathematical model of the received array data, due for example to
incoming plane waves. (See Krim and Viberg, 1996, for an overview of DOA
estimation methods, and Trees, 2002, for details regarding array signal
processing.) In practice, resolution and estimation accuracy are limited by noise as
well as errors in the assumed data model. The focus of this chapter is thus on modeling
errors and, in particular, calibration techniques to mitigate them.

The most natural and common approach is to measure the spatial response
of the array in an anechoic chamber. These calibration measurements are then
used to update the data model, either in the form of explicit unknown parameters
or nonparametrically. Under favorable conditions it is also possible to
estimate the response model together with the unknown directions (so-called
auto-calibration). The purpose of this chapter is to give an overview of existing
techniques and discuss their respective pros and cons. Some more recent
developments and interpretations are also included.

It should be mentioned that calibration of a real-world antenna (or other
sensor) array also involves hardware adjustments to compensate for temperature
drift and the like. The methods considered here can be classified as software
calibration, where errors are handled by adjusting the assumed data model rather
than correcting for it. An example of a hardware-based calibration procedure is
given in Sakaguchi et al. (2002).


By array calibration we refer to the procedure of correcting the assumed
array response model for errors due to several imperfections. These can include
gain and phase errors in the receivers, imbalance between the I and Q channels,
an incorrectly specified mutual coupling model, and uncertain sensor locations.
The correction uses measured data. The least demanding approach from a data
collection point of view is auto-calibration, where array response parameters
are estimated jointly with the DOAs using the same data. Clearly, this requires
a precise specification of the error model, which in turn cannot be "too
complicated." Fundamental limitations of joint estimation of sensor positions and
DOAs are presented in Levi and Messer (1990) and Rockah and Schultheiss
(1987). Results for gain and phase estimation can be found in Astely et al. (1999),
Paulraj and Kailath (1985), and Weiss and Friedlander (1990). A useful overview
of auto-calibration is given in Li et al. (2003).

When calibration data is available, using sources at known positions,
auto-calibration methods can of course be directly applied with fixed DOA parameters.
Limitations and methods are discussed in various articles (e.g., Ng and Nehorai,
1995; Ng and See, 1996). Another interesting approach that does not require any
specification of the array response model is the array interpolation, or manifold
separation, approach (Doron and Doron, 1994a,b,c). The idea is to use a Fourier
representation of the array response, which has the advantage of allowing simpler
DOA estimation approaches developed especially for a uniform linear array
(ULA) (e.g., Barabell, 1983; Roy and Kailath, 1989; Stoica and Sharman, 1990).
A more recent contribution is Belloni et al. (2007), where the array interpolation
[...] algorithm (Barabell, 1983). A different approach, applicable also when the
number of calibration DOAs is sparse, is interpolation using a nominal model
(Lanne et al., 2006; Ottersten et al., 1992b). A simple linear interpolation of a
correction factor to the nominal model is often sufficient, although a smoothed
interpolation can be beneficial, especially if the "true" correction is smooth or
the calibration data is unreliable.

A natural question in this context is how errors in the calibration data propagate
to errors in the DOA estimates, and how this effect compares to errors due to
noise in finite samples. A theoretical study of accuracy requirements is provided
in Porat and Friedlander (1997) for the case of parametric calibration. A general
sensitivity analysis of DOA estimation algorithms is also a useful tool (e.g.,
Swindlehurst and Kailath, 1992, 1993; Viberg and Swindlehurst, 1994a), where
the combined effects of modeling errors and finite samples are considered.
Experimental results for antenna arrays using calibration measurements are presented
in Dandekar et al. (2000), Gupta et al. (2003), Pettersson and Grahn (2003), and
Pierre and Kaveh (1995), although these are mostly based on global models. Here
we present a performance comparison between the various approaches based on
simulated data using more general array error models.
















































































































































Section 3.2 describes the data model under ideal as well as more realistic
conditions. Next, selected methods for DOA estimation are reviewed in Section 3.3.
Section 3.4 introduces the auto-calibration approach, and Section 3.5 discusses
methods that rely on calibration data. Array interpolation techniques with and
without availability of a nominal model are discussed in Section 3.6. The
different calibration approaches are then compared in terms of applicability and
performance in Section 3.7. The chapter concludes with Section 3.8.


3.2 DATA AND ERROR MODELS 

This section presents the data models and the underlying assumptions of the 
chapter. We start with the ideal case and then move on to a more realistic model 
for an electromagnetic antenna array. Various sources of error are also pointed 
out, and their modeling is discussed. 

3.2.1 Ideal Data Model 

The underlying problem in this chapter is that of estimating the DOAs of incoming
signals from distant sources (Figure 3.1). The different sensors will capture
the same signals but with time delays that depend on the DOAs. Throughout
this chapter, we will assume that the transmitted signals are narrowband with
the same center frequency $\omega_c$. After the front-end electronics (usually including
down-converting to an intermediate frequency), the signal is digitized
(analog-to-digital conversion, ADC); the direct down-conversion (DDC) block then takes
the signal to baseband and extracts the real (in-phase) and imaginary
(quadrature-phase) components. The resulting signal at sensor $m$ with sample index $n$ is
denoted $x_m(n)$, for $m = 1, \ldots, M$. The sample index $n$ corresponds to an arbitrary
sampling instant $t_n$ (in seconds), and all sensor elements are assumed to be
sampled at the same time instant. A batch of $N$ time samples is assumed available
from each sensor. The resulting data is collected in the array output vector
$\mathbf{x}(n) = [x_1(n), \ldots, x_M(n)]^T$, $n = 1, \ldots, N$. In the narrowband case, a time delay
$\tau$ (seconds) results in a phase shift of $\omega_c\tau$ (radians) at the baseband, where $\omega_c$ is
the center (or carrier) frequency. Thus, the noise-free array output from an ideal
sensor array, due to a single signal source from the DOA $\theta$, can be modeled by
the linear relation

$$\mathbf{x}(n) = \mathbf{a}(\theta)\, s(n)$$ (3.1)

FIGURE 3.1 Plane waves from far-field sources arriving at an antenna array. The problem of
interest is estimating the directions to the sources.

If we define the signal waveform $s(n)$ at some reference location (the array phase
center), then the array response vector $\mathbf{a}(\theta)$ takes the form

$$\mathbf{a}(\theta) = [e^{j\omega_c\tau_1(\theta)}, \ldots, e^{j\omega_c\tau_M(\theta)}]^T$$ (3.2)

where $\tau_m(\theta)$ is the $\theta$-dependent time delay of the signal at sensor $m$ relative
to the reference. Under the assumption of linearity, the superposition principle
applies, and the sensor output due to $Q$ far-field emitters can be written as

$$\mathbf{x}(n) = \sum_{q=1}^{Q} \mathbf{a}(\theta_q)\, s_q(n) + \mathbf{n}(n)$$ (3.3)

where $\theta_q$ are the DOAs of the signal sources, $s_q(n)$ are the corresponding signal
waveforms, and $\mathbf{n}(n)$ represents an additive noise term. The noise represents
components in the actual array output that cannot be well modeled by the simple
relation (3.1), for example, external noise sources that are not pointlike and
thermal noise in the receivers. The model (3.3) can be put in compact matrix
form by defining the array response matrix $\mathbf{A}(\boldsymbol{\theta}) = [\mathbf{a}(\theta_1), \ldots, \mathbf{a}(\theta_Q)]$, where
$\boldsymbol{\theta} = [\theta_1, \ldots, \theta_Q]^T$ is the vector of DOA parameters, and the signal vector
$\mathbf{s}(n) = [s_1(n), \ldots, s_Q(n)]^T$. The resulting model is

$$\mathbf{x}(n) = \mathbf{A}(\boldsymbol{\theta})\,\mathbf{s}(n) + \mathbf{n}(n), \quad n = 1, \ldots, N$$ (3.4)

or, more compactly,

$$\mathbf{X} = [\mathbf{x}(1), \ldots, \mathbf{x}(N)] = \mathbf{A}(\boldsymbol{\theta})\,\mathbf{S} + \mathbf{N}$$ (3.5)

Most methods use only second-order properties of the data. The array
covariance matrix is defined as $\mathbf{R} = E[\mathbf{x}(n)\mathbf{x}^*(n)]$, where $E[\cdot]$ is the statistical
expectation and $(\cdot)^*$ denotes the complex conjugate and transpose (the Hermitian
operator). Assuming the signal and noise to be independent and the noise
to be spatially white ($E[\mathbf{n}(n)\mathbf{n}^*(n)] = \sigma^2\mathbf{I}$), the array covariance matrix takes
the form

$$\mathbf{R} = \mathbf{A}(\boldsymbol{\theta})\,\mathbf{P}\,\mathbf{A}^*(\boldsymbol{\theta}) + \sigma^2\mathbf{I}$$ (3.6)

where the signal waveforms are assumed stationary so that the signal covariance
matrix $\mathbf{P} = E[\mathbf{s}(n)\mathbf{s}^*(n)]$ is independent of $n$. The array covariance matrix is
estimated from the available data by the sample covariance matrix

$$\hat{\mathbf{R}} = \frac{1}{N}\sum_{n=1}^{N} \mathbf{x}(n)\mathbf{x}^*(n) = \frac{1}{N}\mathbf{X}\mathbf{X}^*$$ (3.7)


Since the chief interest here is modeling errors, we frequently assume that
$N \to \infty$, so that the sample covariance matrix is a perfect estimate of $\mathbf{R}$. Note
that this holds also in the case of quasi-stationary deterministic signals, where $\mathbf{P}$
then denotes the limiting sample covariance matrix of $\mathbf{s}(n)$.
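
To make the data model concrete, the following sketch simulates array snapshots and compares the sample covariance with the model covariance. Everything specific in it is an assumption made for illustration: a uniform linear array with half-wavelength spacing, unit-power uncorrelated Gaussian sources (signal covariance equal to the identity), the chosen DOAs and noise variance, and all function names.

```python
import numpy as np

def ula_response_matrix(doas_rad, num_sensors, spacing_wavelengths=0.5):
    """Ideal response matrix; the ULA geometry is an assumption of this sketch."""
    phi = 2.0 * np.pi * spacing_wavelengths * np.sin(np.atleast_1d(doas_rad))
    m = np.arange(num_sensors)[:, None]
    return np.exp(1j * m * phi[None, :])  # shape (M, Q)

def complex_gaussian(shape, variance, rng):
    """Circular complex Gaussian samples with the given variance."""
    scale = np.sqrt(variance / 2.0)
    return scale * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))

def simulate_array_data(doas_rad, num_sensors, num_snapshots, noise_var, rng):
    """Generate X = A S + N with unit-power sources and spatially white noise."""
    A = ula_response_matrix(doas_rad, num_sensors)
    S = complex_gaussian((A.shape[1], num_snapshots), 1.0, rng)
    noise = complex_gaussian((num_sensors, num_snapshots), noise_var, rng)
    return A @ S + noise, A

def sample_covariance(X):
    """Sample covariance matrix (1/N) X X^*."""
    return X @ X.conj().T / X.shape[1]

rng = np.random.default_rng(0)
doas = np.deg2rad([-10.0, 15.0])
X, A = simulate_array_data(doas, num_sensors=6, num_snapshots=20000,
                           noise_var=0.1, rng=rng)
R_hat = sample_covariance(X)
R_model = A @ A.conj().T + 0.1 * np.eye(6)  # model covariance with P = I
relative_gap = np.linalg.norm(R_hat - R_model) / np.linalg.norm(R_model)
```

With this many snapshots the relative gap is small, which is the finite-sample counterpart of the large-$N$ assumption used in the text.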

The DOA estimation problem is now stated as follows: Given an observation
of $\mathbf{X}$, modeled by Equation (3.5), determine the DOA parameters $\boldsymbol{\theta}$. An important
partial problem is of course to determine the number of sources $Q$, which
is the dimension of $\boldsymbol{\theta}$. Another problem of great interest, in particular in
communication applications, is to estimate the signal waveforms in $\mathbf{S}$. However, the
focus in this chapter is on DOA estimation. Methods for estimating $Q$ are available
(e.g., Wax and Kailath, 1985; Zhao et al., 1986), whereas Er and Ng (1994)
and Ottersten et al. (1989) address the signal waveform estimation problem (see
also the various publications on "smart antennas" for wireless communication,
for example, Paulraj et al., 2003). A key issue in this context is that the
functional form of the array response $\mathbf{a}(\theta)$, often termed the array manifold, must be
accurately known. This is our main topic and various approaches to acquire this
information are discussed. Perhaps the most natural start for an engineer is to
revert to the underlying physics.


3.2.2 Real Antenna Array Modeling 


Mutual coupling can have a significant influence on array performance in
beamforming (Josefsson and Persson, 1999; Steyskal and Herd, 1990) and on
DOA estimation if it is not correctly accounted for (Svantesson, 1998; Swindlehurst
and Kailath, 1992). The strength of the mutual coupling is mostly determined
by the element type (its radiation characteristics), the distance between
the elements, and how the elements are oriented relative to one another.

Mutual coupling is commonly represented by its S-parameters, which are
the elements of the scattering matrix $\mathbf{S}$. Scattering matrix formulation is a
well-known concept from circuit theory, which relates the output waves to the input
waves of an $N$-port junction (Collin, 1992, chapter 4.7). The concept is easily
applied to array antennas by assigning one port to each antenna element, as
in Figure 3.2. If the applied excitation is called $\mathbf{v}^+$, the reflected waves are
given by

$$\mathbf{v}^- = \mathbf{S}\,\mathbf{v}^+$$ (3.8)

or

$$\begin{bmatrix} v_1^- \\ v_2^- \\ \vdots \\ v_M^- \end{bmatrix} = \begin{bmatrix} S_{11} & S_{12} & \cdots & S_{1M} \\ S_{21} & S_{22} & \cdots & S_{2M} \\ \vdots & \vdots & \ddots & \vdots \\ S_{M1} & S_{M2} & \cdots & S_{MM} \end{bmatrix} \begin{bmatrix} v_1^+ \\ v_2^+ \\ \vdots \\ v_M^+ \end{bmatrix}$$ (3.9)


In a transmitting array antenna, the backward-moving waves $\mathbf{v}^-$ are caused by
mutual coupling from the surrounding elements and the mismatch in the element
itself. Similarly, in a receiving array the backward-moving waves constitute the
received voltages coming directly from the source as well as reradiated energy
due to the mutual coupling. The received antenna voltages due to a far-field
emitter from DOA $\theta$ can be modeled by

$$\mathbf{v} = \mathbf{C}\,\mathbf{a}(\theta)\, v_{\mathrm{isol}}$$

where $\mathbf{a}(\theta)$ is the purely geometrical steering vector, $v_{\mathrm{isol}}$ is the received voltage
for an isolated element, and $\mathbf{C}$ is the mutual coupling matrix, which describes
how the received signals change as a result of mutual coupling. Assuming an
array of dipoles where the dipole self-impedance is equal to the characteristic
impedance of the feeding lines, it can be shown that

$$\mathbf{C} = \mathbf{I} - \mathbf{S}$$

FIGURE 3.2 Mutual coupling and scattering parameters.


The corresponding derivation of the coupling matrix for an array of waveguide
openings can be found in Steyskal and Herd (1990) and leads to

$$\mathbf{C} = \mathbf{I} + \mathbf{S}$$

If the mutual coupling is known, it can be included in the data model by
changing the steering vectors such that

$$\mathbf{a}(\theta) \rightarrow \mathbf{a}_c(\theta) = \mathbf{C}\,\mathbf{a}(\theta)$$ (3.10)


If the mutual coupling matrix is unknown, or only approximately known, 
an error is introduced in the data model. In addition, there are uncertainties 
in the array response model caused by limited manufacturing accuracy and 
insufficient isolation between ideal components and nonideal components. It 
should also be mentioned that the incoming energy to a receiving array can 
include other unmodeled phenomena, such as near-field scattering and multipath 
propagation. 

The main sources of array modeling errors are summarized as 


• Mutual coupling unknown or misspecified 

• Antenna element positions and orientations not perfectly known 

• Gain and phase imbalances 

• IQ imbalances in receivers (generally small) 

• Near-field scattering due to platform or terrain 

• Internal leakages in the system 

• Quantization in phase shifters, attenuators, and ADCs 

• Nonlinear components 


For more information on array errors, see Allen (1964), Elliott (1958), Mailloux
(2005), Pettersson (1992), and Swindlehurst and Kailath (1992).

The remedy for array modeling errors is array calibration. The calibration
methods considered here are mostly based on calibration measurements. If the
array manifold is assumed to be time-invariant, methods based on characterization
at a measurement range are sufficient. On the other hand, to detect changes
with time (and, for example, temperature), online calibration methods are
needed. These include signal injection (e.g., Lanne and Drackner, 2003), mutual
coupling-based measurements (Agrawal and Jablon, 2003; Aumann et al., 1989),
and auto-calibration (see Section 3.4). Only characterization measurements
using far-field sources at known positions and auto-calibration techniques are
considered in this chapter.

3.3 DIRECTION-OF-ARRIVAL ESTIMATION

The purpose of this section is to present some of the most influential methods
for DOA estimation, since this is the intended use of array calibration.
The review presented here is far from complete (see Krim and Viberg (1996),
Trees (2002), and the other chapters of this book for more details and methods).
The DOA estimation problem is restated as follows: Given observations
$\mathbf{X} = [\mathbf{x}(1), \ldots, \mathbf{x}(N)]$, determine the DOAs $\boldsymbol{\theta} = [\theta_1, \ldots, \theta_Q]^T$ of the incoming
signals. The array response model $\mathbf{a}(\theta)$ is assumed to be a known function of $\theta$
over the DOA range of interest. The response may be computed from a detailed
electromagnetic simulation, for example, or, as in this case, by a calibration
procedure.


3.3.1 Classical Beamforming 

Perhaps the most natural approach to DOA estimation is to coherently combine
all sensor outputs, assuming a source at a hypothetical DOA $\theta$, and measure the
resulting power. The field of view is then scanned, and the peaks in the resulting
spatial power spectrum estimate are taken as the DOA estimates. The coherent
combination is $\mathbf{a}^*(\theta)\mathbf{x}(n)$ (a spatially matched filter), and the average power is
given by

$$P_{BF}(\theta) = \frac{1}{N}\sum_{n=1}^{N} \left|\mathbf{a}^*(\theta)\mathbf{x}(n)\right|^2$$ (3.11)

For a ULA, the propagation vectors take the form

$$\mathbf{a}(\theta) = [1, e^{j\phi}, \ldots, e^{j(M-1)\phi}]^T$$ (3.12)


where $\phi = k\Delta\sin\theta$ is called the electrical angle. Here, $k = \omega_c/c$ is the wavenumber,
$c$ is the speed of propagation, and $\Delta$ is the inter-element separation. The DOA
$\theta$ is taken relative to the array broadside. Thus, Equation (3.11) is in this case
the length-$M$ periodogram, averaged over $N$ realizations. Since the method only
requires finding the $Q$ highest peaks of the one-dimensional spectrum $P_{BF}(\theta)$,
it is computationally attractive. However, beamforming shares the drawbacks
of other Fourier-based techniques: limited resolution and possible masking of
weak signals. For the ULA case, a single source gives rise to a mainlobe of width
$\Delta\phi \approx 2\pi/M$, which is the classical Fourier resolution, and a maximum sidelobe
13 dB below the mainlobe. The sidelobes can be reduced by windowing, at
the cost of poorer resolution, and the resolution can, in high-SNR scenarios, be
improved by adaptive beamforming (Serebryakov, 1995). However, beamforming
methods are in general unable to fully exploit the data model and achieve
correct estimates (consistency) as $N \to \infty$.
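
The scan described in this section can be sketched in a few lines. The example below is a minimal illustration under stated assumptions (half-wavelength ULA, a single noise-free source, a 0.5° scan grid); the function names and test signal are hypothetical.

```python
import numpy as np

def ula_response_matrix(doas_rad, num_sensors, spacing_wavelengths=0.5):
    """ULA response vectors as columns; half-wavelength spacing is assumed."""
    phi = 2.0 * np.pi * spacing_wavelengths * np.sin(np.atleast_1d(doas_rad))
    m = np.arange(num_sensors)[:, None]
    return np.exp(1j * m * phi[None, :])

def beamforming_spectrum(X, scan_rad):
    """Average power of the spatially matched filter output on a grid of DOAs."""
    A_scan = ula_response_matrix(scan_rad, X.shape[0])
    outputs = A_scan.conj().T @ X          # a*(theta) x(n) for every theta and n
    return np.mean(np.abs(outputs) ** 2, axis=1)

scan_deg = np.linspace(-90.0, 90.0, 361)   # 0.5-degree grid
a_true = ula_response_matrix(np.deg2rad(20.0), num_sensors=8)  # source at 20 degrees
s = np.exp(1j * 0.3 * np.arange(50))[None, :]                  # unit-modulus waveform
X = a_true @ s                                                 # noise-free snapshots
P_bf = beamforming_spectrum(X, np.deg2rad(scan_deg))
doa_estimate_deg = scan_deg[np.argmax(P_bf)]
```

For several sources one would pick the $Q$ highest local maxima of `P_bf` rather than the global maximum.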
3.3.2 Nonlinear Least Squares

Another very natural approach to parameter estimation is least squares (LS).
In other words, the method tries to find the parameters that best match the
observations in a least-squares sense. Given the model (3.5), with unknown parameters
$\boldsymbol{\theta}$ and $\mathbf{S}$, the LS estimator is formulated as

$$\{\hat{\boldsymbol{\theta}}, \hat{\mathbf{S}}\} = \arg\min_{\boldsymbol{\theta},\,\mathbf{S}} \|\mathbf{X} - \mathbf{A}(\boldsymbol{\theta})\mathbf{S}\|_F^2$$ (3.13)


where $\|\cdot\|_F$ is the Frobenius matrix norm. This is usually termed nonlinear least
squares (NLLS), since the DOA parameters enter nonlinearly in the criterion
function. It is noted that the method is identical to maximum likelihood (ML)
if the signal waveforms $\mathbf{s}(n)$ are regarded as unknown deterministic parameters.
Therefore, the term deterministic ML (DML), or conditional ML (CML),
is also used. For a fixed (but unknown) value of $\boldsymbol{\theta}$, the criterion is linear in
the signal matrix $\mathbf{S}$. Thus, it belongs to the class of separable NLLS problems
(Golub and Pereyra, 2003). Suppressing the $\boldsymbol{\theta}$ dependence of $\mathbf{A}$ for notational
convenience, the LS solution is $\hat{\mathbf{S}}(\boldsymbol{\theta}) = (\mathbf{A}^*\mathbf{A})^{-1}\mathbf{A}^*\mathbf{X}$. Substituting this back into
Equation (3.13), the NLLS estimate of $\boldsymbol{\theta}$ is found by

$$\hat{\boldsymbol{\theta}} = \arg\min_{\boldsymbol{\theta}} \operatorname{Tr}\{\mathbf{P}_A^{\perp}\hat{\mathbf{R}}\}$$ (3.14)

where $\mathbf{P}_A^{\perp} = \mathbf{I} - \mathbf{A}(\mathbf{A}^*\mathbf{A})^{-1}\mathbf{A}^*$ is an orthogonal projection, and $\hat{\mathbf{R}} = \frac{1}{N}\mathbf{X}\mathbf{X}^*$ is the
sample covariance matrix.
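
The substitution step can be spelled out as a short worked derivation. It uses only the definitions above; the shorthand $\mathbf{P}_A = \mathbf{A}(\mathbf{A}^*\mathbf{A})^{-1}\mathbf{A}^*$ is introduced here for convenience, and the properties used are that $\mathbf{P}_A^{\perp} = \mathbf{I} - \mathbf{P}_A$ is Hermitian and idempotent and that the trace is invariant under cyclic permutation.

```latex
\begin{align*}
\|\mathbf{X} - \mathbf{A}\hat{\mathbf{S}}\|_F^2
  &= \|\mathbf{X} - \mathbf{P}_A\mathbf{X}\|_F^2
   = \|\mathbf{P}_A^{\perp}\mathbf{X}\|_F^2 \\
  &= \operatorname{Tr}\{\mathbf{P}_A^{\perp}\mathbf{X}\mathbf{X}^*\mathbf{P}_A^{\perp}\}
   = \operatorname{Tr}\{\mathbf{P}_A^{\perp}\mathbf{X}\mathbf{X}^*\}
   = N\,\operatorname{Tr}\{\mathbf{P}_A^{\perp}\hat{\mathbf{R}}\}
\end{align*}
```

Since the factor $N$ does not depend on $\boldsymbol{\theta}$, minimizing the left-hand side over $\boldsymbol{\theta}$ is the same as minimizing $\operatorname{Tr}\{\mathbf{P}_A^{\perp}\hat{\mathbf{R}}\}$.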

Note that (3.14) reduces to the classical beamforming estimate for $Q = 1$
and $\mathbf{A} = \mathbf{a}(\theta)$. However, for $Q > 1$, the NLLS method is generally superior. The
resolution does not only depend on $M$, but can be arbitrarily good for high
enough signal-to-noise ratio (SNR) or $N$. The drawback is that Equation (3.14)
is a $Q$-dimensional nonlinear optimization, which is in general not convex.
A popular implementation is based on solving for the parameters of one source
at a time (Fessler and Hero, 1994; Fleury et al., 1999; Ziskind and Wax, 1988)
in an iterative search. Once sufficiently good initial estimates have been found,
it is more efficient to switch to a local optimization using a Newton-type method
(Golub and Pereyra, 2003; Viberg et al., 1991).

We can alternatively model the signal waveforms as stochastic and
Gaussian-distributed. The resulting maximum likelihood (ML) estimator can in some
scenarios outperform NLLS, particularly when the signals are highly correlated.
Interested readers are referred to Ottersten et al. (1992a) and Stoica and Nehorai
(1990) for details.


3.3.3 Subspace Methods: The MUSIC Algorithm 

Although the NLLS method can produce excellent statistical properties, its
computational complexity motivates alternative approaches that are not only
cheaper but also superior to conventional beamforming. One important class of such
approaches is subspace methods. The underlying observation is that if the signal
matrix $\mathbf{S}$ has full rank, the noise-free data $\mathbf{X} = \mathbf{A}\mathbf{S}$ is confined to the span of the
columns of $\mathbf{A}$, denoted span$\{\mathbf{A}\}$, which is therefore termed the signal subspace.
It is easy to show that the signal subspace is also spanned by the $Q$ principal
eigenvectors of the array covariance matrix $\mathbf{R}$, defined in Equation (3.6).
Thus, the signal subspace is estimated from the available data by the $Q$ principal
eigenvectors of the sample covariance matrix

$$\hat{\mathbf{R}} = \sum_{k=1}^{M} \hat{\lambda}_k \hat{\mathbf{e}}_k \hat{\mathbf{e}}_k^* = \hat{\mathbf{E}}_s\hat{\boldsymbol{\Lambda}}_s\hat{\mathbf{E}}_s^* + \hat{\mathbf{E}}_n\hat{\boldsymbol{\Lambda}}_n\hat{\mathbf{E}}_n^*$$ (3.15)


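where $\hat{\mathbf{E}}_s = [\hat{\mathbf{e}}_1, \ldots, \hat{\mathbf{e}}_Q]$ contains the $Q$ principal "signal eigenvectors." Ideally,
span$\{\mathbf{E}_s\}$ = span$\{\mathbf{A}\}$, and because $\mathbf{E}_s$ and $\mathbf{E}_n$ are orthogonal, it holds that
span$\{\mathbf{E}_n\} \perp$ span$\{\mathbf{A}\}$. In the MUSIC algorithm (Schmidt, 1979, reprinted in
Schmidt, 1986; see also Bienvenu and Kopp, 1979), this relation is exploited
by forming a "pseudo-spectrum"

$$P_{MU}(\theta) = \frac{\|\mathbf{a}(\theta)\|^2}{\|\hat{\mathbf{E}}_n^*\mathbf{a}(\theta)\|^2}$$ (3.16)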


The MUSIC DOA estimates are now the locations of the $Q$ highest peaks
of $P_{MU}(\theta)$. Since $\hat{\mathbf{E}}_s \to \mathbf{E}_s$ as $N \to \infty$ in a suitable statistical sense, the MUSIC
spectrum can resolve arbitrarily close sources for large enough $N$ (or SNR). Yet
it only requires a one-dimensional search similar to the classical beamforming
approach. The one additional burden is the eigendecomposition of $\hat{\mathbf{R}}$, which can
be computed efficiently if $Q \ll M$ (Xu and Kailath, 1994; Yang, 1995). For this
reason, the method and its variations (e.g., Gershman and Böhme, 1997; Lee and
Wengrovitz, 1990; Xu and Kaveh, 1996) have become very popular since being
introduced in the late 1970s. We reiterate that the signal matrix needs to have full
rank for the MUSIC method to work (i.e., coherent sources cannot be handled).
For ULAs, so-called spatial smoothing and forward-backward averaging techniques
have been proposed to overcome this limitation (Friedlander and Weiss,
1992; Shan et al., 1985).
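
A minimal sketch of the MUSIC pseudo-spectrum follows. It makes several assumptions for illustration: a half-wavelength ULA, an exact (infinite-sample) covariance matrix with uncorrelated unit-power sources, a 0.5° scan grid, and hypothetical function names. The small floor on the denominator is an implementation detail added here to avoid division by zero when a grid point coincides with a true DOA.

```python
import numpy as np

def ula_response_matrix(doas_rad, num_sensors, spacing_wavelengths=0.5):
    """ULA response vectors as columns; half-wavelength spacing is assumed."""
    phi = 2.0 * np.pi * spacing_wavelengths * np.sin(np.atleast_1d(doas_rad))
    m = np.arange(num_sensors)[:, None]
    return np.exp(1j * m * phi[None, :])

def music_spectrum(R_hat, num_sources, scan_rad):
    """MUSIC pseudo-spectrum on a grid of candidate DOAs."""
    num_sensors = R_hat.shape[0]
    _, eigvecs = np.linalg.eigh(R_hat)               # eigenvalues in ascending order
    E_n = eigvecs[:, : num_sensors - num_sources]    # noise-subspace eigenvectors
    A_scan = ula_response_matrix(scan_rad, num_sensors)
    numerator = np.sum(np.abs(A_scan) ** 2, axis=0)
    denominator = np.sum(np.abs(E_n.conj().T @ A_scan) ** 2, axis=0)
    return numerator / np.maximum(denominator, 1e-300)

def highest_peaks(spectrum, num_peaks):
    """Indices of the highest local maxima, returned in increasing index order."""
    interior = np.arange(1, len(spectrum) - 1)
    is_peak = (spectrum[interior] > spectrum[interior - 1]) & \
              (spectrum[interior] >= spectrum[interior + 1])
    peaks = interior[is_peak]
    best = peaks[np.argsort(spectrum[peaks])[::-1][:num_peaks]]
    return np.sort(best)

scan_deg = np.linspace(-90.0, 90.0, 361)
A = ula_response_matrix(np.deg2rad([-10.0, 15.0]), num_sensors=6)
R = A @ A.conj().T + 0.1 * np.eye(6)                 # exact covariance, P = I
P_mu = music_spectrum(R, num_sources=2, scan_rad=np.deg2rad(scan_deg))
doa_estimates_deg = scan_deg[highest_peaks(P_mu, 2)]
```

In practice `R` would be replaced by the sample covariance matrix, and the peaks become finite but remain sharp for large $N$ or high SNR.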


3.4 AUTO-CALIBRATION TECHNIQUES 


The purpose of this section is to give a brief introduction to the auto-calibration
approach. A different exposition including more methods is available in Li et al.
(2003). Common to all of these techniques is that a parametric model of the array
response, which includes some "array parameters," is exploited. Formally we
have $\mathbf{a} = \mathbf{a}(\theta, \boldsymbol{\rho})$, where $\boldsymbol{\rho} = [\rho_1, \ldots, \rho_P]^T$ is a vector of $P$ unknown parameters
associated with the uncertainty in the response. As an example, consider the case
of isotropic and identical antenna elements at unknown positions $(x_m, y_m)$ in the
xy-plane. The DOA parameter is the azimuth angle, taken clockwise relative to
the y-axis. For simplicity, it is assumed that all sources are in the same plane as
the sensors; that is, the elevation angle is ignored. In this case, the array response
is modeled as

$$\mathbf{a}(\theta, \boldsymbol{\rho}) = [e^{jk(x_1\sin\theta + y_1\cos\theta)}, \ldots, e^{jk(x_M\sin\theta + y_M\cos\theta)}]^T$$ (3.17)


Thus, $P = 2M$ and $\boldsymbol{\rho} = [x_1, y_1, \ldots, x_M, y_M]^T$ (although, as will be pointed out
later, not all of these parameters can be determined without additional knowledge).
Another case of interest is calibration of gain and phase (i.e., complex
gain). Again assuming a planar array, the array response is modeled by

$$\mathbf{a}(\theta, \boldsymbol{\rho}) = [\gamma_1 e^{jk(x_1\sin\theta + y_1\cos\theta)}, \ldots, \gamma_M e^{jk(x_M\sin\theta + y_M\cos\theta)}]^T$$ (3.18)


If the sensor locations are known, the $P = 2M$ real-valued array parameters
can now be taken as

$$\boldsymbol{\rho} = [\Re\{\gamma_1\}, \Im\{\gamma_1\}, \ldots, \Re\{\gamma_M\}, \Im\{\gamma_M\}]^T$$

where $\Re\{\cdot\}$ and $\Im\{\cdot\}$ denote the real and imaginary parts, respectively.
Equivalently, one can use the magnitudes and phases of the $\gamma_m$'s as parameters. Also, in
the case of auto-calibration with unknown complex gain, additional constraints
are necessary to guarantee unique estimates.

A third case that has been extensively studied is estimation of mutual coupling
parameters. From Equation (3.10), the model can be expressed as

$$\mathbf{a}(\theta, \boldsymbol{\rho}) = \mathbf{C}(\boldsymbol{\rho})\,\mathbf{a}_0(\theta)$$ (3.19)

where $\mathbf{C}(\boldsymbol{\rho})$ is the mutual coupling matrix (MCM) and $\mathbf{a}_0(\theta)$ is the "coupling-free"
array response. The MCM is usually taken as a linear function of the array
parameters $\boldsymbol{\rho}$, with a certain structure that depends on the sensor properties and
the array geometry. Because of reciprocity, $\mathbf{C}(\boldsymbol{\rho})$ is always a symmetric matrix,
and it is reasonable to assume that distant sensors are uncoupled so that $\mathbf{C}(\boldsymbol{\rho})$ is
banded. For a ULA, the MCM is usually modeled as Toeplitz, which is a decent
approximation for large arrays.


3.4.1 Identifiability 


In the auto-calibration approach, also termed self-calibration, all parameters are
estimated simultaneously. This can in principle be done using any of the available
parametric methods for DOA estimation. However, a crucial question is
identifiability of the parameters. In Rockah and Schultheiss (1987), it is shown
that if the position of one sensor is fixed along with the direction to a second
sensor, the positions can be estimated uniquely along with the DOAs in the model
(3.17), provided there are at least three sources present and the sensors are not
collinear. More general results are given in Levi and Messer (1990), including
the possibility of some sources at known locations. If the sensors have unknown
DOA-independent complex gain factors, these can be determined together with
the DOAs under fairly general conditions (Astely et al., 1999). However, the
array needs to be nonlinear. If this is not the case, the DOAs can be determined
only up to a rotational ambiguity (i.e., the differential DOAs are unique).
A more general necessary condition for the case where $\mathbf{a}(\theta, \boldsymbol{\rho})$ is linear in $\boldsymbol{\rho}$
is given in Flieller et al. (1995). It states that to estimate $P$ unknown array
parameters requires that $P < Q(M - Q)$. This condition is generally not sufficient
in all scenarios, but may serve as a useful guide for investigating a particular
parameterization.
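
As a quick screening aid, the necessary condition can be checked before attempting auto-calibration. The helper below is a hypothetical convenience function, not part of any cited method; it takes the inequality exactly as stated above (strict), and a passing check does not guarantee identifiability.

```python
def meets_necessary_condition(num_array_params, num_sources, num_sensors):
    """True if P < Q (M - Q); necessary, but not in general sufficient."""
    return num_array_params < num_sources * (num_sensors - num_sources)

# Illustrative numbers: M = 6 sensors and Q = 2 sources give Q (M - Q) = 8.
ok_for_five_params = meets_necessary_condition(5, num_sources=2, num_sensors=6)
ok_for_eight_params = meets_necessary_condition(8, num_sources=2, num_sensors=6)
```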


3.4.2 Algorithms 

Most of the methods proposed in the literature fall into two categories. The first is
based on an extension of the MUSIC algorithm to handle auto-calibration. The
idea is to maximize the MUSIC pseudo-spectrum over all parameters simultaneously
(a different approach is presented in Paulraj and Kailath, 1985).
The resulting spectrum is

$$P_{MU}(\boldsymbol{\theta}, \boldsymbol{\rho}) = \sum_{q=1}^{Q} \frac{\|\mathbf{a}(\theta_q, \boldsymbol{\rho})\|^2}{\|\hat{\mathbf{E}}_n^*\mathbf{a}(\theta_q, \boldsymbol{\rho})\|^2}$$ (3.20)


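The parameter estimates are found by performing a simultaneous maximization
of $P_{MU}(\boldsymbol{\theta}, \boldsymbol{\rho})$ with respect to $\boldsymbol{\rho}$ and $\theta_q$. In the original algorithm, from Weiss and
Friedlander (1990), the normalization by $\|\mathbf{a}(\theta_q, \boldsymbol{\rho})\|^2$ is ignored and the "null
spectrum" to be minimized is expressed as

$$P_{MU}(\boldsymbol{\theta}, \boldsymbol{\rho}) = \sum_{q=1}^{Q} \mathbf{a}^*(\theta_q, \boldsymbol{\rho})\,\hat{\mathbf{E}}_n\hat{\mathbf{E}}_n^*\,\mathbf{a}(\theta_q, \boldsymbol{\rho})$$ (3.21)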

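The advantage of this formulation is apparent for the case where $\mathbf{a}(\theta, \boldsymbol{\rho})$ is a
linear function of $\boldsymbol{\rho}$ (i.e., for channel or mutual coupling errors). For fixed values
of $\{\theta_q\}_{q=1}^{Q}$, Equation (3.21) is a quadratic function in $\boldsymbol{\rho}$, which can be minimized
explicitly. To clarify this, note that a linear parameterization enables us to express
the array response vector as

$$\mathbf{a}(\theta, \boldsymbol{\rho}) = \mathbf{M}(\theta)\,\boldsymbol{\rho}$$ (3.22)

for some $M \times P$ matrix-valued function $\mathbf{M}(\theta)$. Thus,

$$P_{MU}(\boldsymbol{\theta}, \boldsymbol{\rho}) = \sum_{q=1}^{Q} \boldsymbol{\rho}^*\mathbf{M}^*(\theta_q)\,\hat{\mathbf{E}}_n\hat{\mathbf{E}}_n^*\,\mathbf{M}(\theta_q)\,\boldsymbol{\rho} = \boldsymbol{\rho}^*\mathcal{M}(\boldsymbol{\theta})\,\boldsymbol{\rho}$$ (3.23)

where

$$\mathcal{M}(\boldsymbol{\theta}) = \sum_{q=1}^{Q} \mathbf{M}^*(\theta_q)\,\hat{\mathbf{E}}_n\hat{\mathbf{E}}_n^*\,\mathbf{M}(\theta_q)$$ (3.24)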

The minimization of Equation (3.23) w.r.t. $\boldsymbol{\rho}$ requires a constraint to avoid the
trivial solution. Depending on the application, a norm constraint $\|\boldsymbol{\rho}\| = 1$ or a
linear constraint, such as $\rho_1 = 1$, can be used. Given the constraint, the
minimization of Equation (3.23) results in an updated value of $\boldsymbol{\rho}$. Next, for a fixed
value of $\boldsymbol{\rho}$, the minimization of Equation (3.21) w.r.t. $\{\theta_q\}_{q=1}^{Q}$ reduces to the
standard MUSIC algorithm. Thus, the joint minimization can be performed
iteratively, updating $\boldsymbol{\rho}$ and $\boldsymbol{\theta}$ alternatingly until convergence. The iterations are started
by estimating $\boldsymbol{\theta}$ using a “nominal value” for the array parameters $\boldsymbol{\rho}$. Since the
joint optimization problem is nonlinear, the initial value needs to be sufficiently
accurate to guarantee that the global optimum is found.
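
The update of the array parameters in this alternating scheme can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the published Weiss-Friedlander implementation: the array is taken to be a half-wavelength ULA with unknown complex sensor gains, so that the matrix in (3.22) is diagonal, the exact covariance matrix is used in place of a sample estimate, and the DOAs are held at their true values so that only the minimization over the array parameters is shown. The function names `ula`, `noise_subspace`, and `rho_update` are hypothetical.

```python
import numpy as np

def ula(theta_deg, M):
    """Nominal half-wavelength ULA response (assumed geometry)."""
    return np.exp(1j * np.pi * np.arange(M) * np.sin(np.deg2rad(theta_deg)))

def noise_subspace(R, Q):
    """Eigenvectors of R belonging to the M - Q smallest eigenvalues."""
    _, V = np.linalg.eigh(R)              # eigenvalues in ascending order
    return V[:, : R.shape[0] - Q]

def rho_update(En, thetas_deg):
    """Minimize rho^* M_tilde rho subject to ||rho|| = 1, cf. (3.23)-(3.24)."""
    M = En.shape[0]
    Pn = En @ En.conj().T                 # projector onto the noise subspace
    M_tilde = np.zeros((M, M), dtype=complex)
    for th in thetas_deg:
        Mq = np.diag(ula(th, M))          # gain model: a(theta, rho) = diag(a0(theta)) rho
        M_tilde += Mq.conj().T @ Pn @ Mq
    _, V = np.linalg.eigh(M_tilde)
    rho = V[:, 0]                         # eigenvector of the smallest eigenvalue
    return rho / rho[0]                   # rescale so that rho_1 = 1

# Assumed test scenario: two uncorrelated unit-power sources, exact covariance.
M, thetas = 6, [-20.0, 30.0]
rng = np.random.default_rng(0)
rho_true = 1 + 0.1 * (rng.standard_normal(M) + 1j * rng.standard_normal(M))
A = np.column_stack([rho_true * ula(th, M) for th in thetas])
R = A @ A.conj().T + 0.1 * np.eye(M)
rho_hat = rho_update(noise_subspace(R, len(thetas)), thetas)
```

In a full iteration, this step would alternate with a standard MUSIC search over the DOAs using the updated response model. The norm constraint is handled by the eigenvector, and the final rescaling maps the result to the linear constraint on the first element.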

A similar approach is used to simultaneously estimate DOA and mutual
coupling parameters (Friedlander and Weiss, 1991). Provided the DOA and array
parameters are jointly identifiable, the same method is effective for calibrating
any array parameters that enter linearly in the response vector.

Basic Weiss-Friedlander (W-F) auto-calibration is simple and computationally
efficient for the case of linear array parameters. However, performance is
limited by that of the MUSIC algorithm, and in particular the sources must be
resolvable by MUSIC even when the nominal array parameters are used. A more
direct approach is to estimate all parameters simultaneously in an NLLS or ML
setting. The extension of Equation (3.14) to the case of unknown array parameters
is conceptually straightforward (Weiss and Friedlander, 1989):

$$
\{\hat{\boldsymbol{\theta}}, \hat{\boldsymbol{\rho}}, \hat{\mathbf{S}}\} = \arg\min_{\boldsymbol{\theta}, \boldsymbol{\rho}, \mathbf{S}} \|\mathbf{X} - \mathbf{A}(\boldsymbol{\theta}, \boldsymbol{\rho})\mathbf{S}\|_F^2 \tag{3.25}
$$

The criterion is still quadratic in $\mathbf{S}$, so the separable solution (3.14) is still
applicable. Provided all parameters are identifiable, Equation (3.25) is expected
to give the best possible performance. Yet the identifiability problem for
auto-calibration methods is quite restrictive and limits their practical use. In short, the
simultaneous parameter estimation of $\boldsymbol{\theta}$ and $\boldsymbol{\rho}$ is often severely ill-conditioned.
This situation can to some extent be remedied if approximate values $\boldsymbol{\rho}_0$ of $\boldsymbol{\rho}$
are known. The array parameters can then be modeled as random with a certain
a priori distribution. The simplest and most common approach is to assume
a Gaussian distribution, where only the covariance matrix
$\mathbf{C}_\rho = E[(\boldsymbol{\rho} - \boldsymbol{\rho}_0)(\boldsymbol{\rho} - \boldsymbol{\rho}_0)^T]$ (or $\mathbf{C}_\rho = E[(\boldsymbol{\rho} - \boldsymbol{\rho}_0)(\boldsymbol{\rho} - \boldsymbol{\rho}_0)^*]$, in the case of
complex circularly symmetric error parameters) needs to be specified. With these
assumptions, it is possible to put the joint estimation in a Bayesian framework,
where $\boldsymbol{\rho}$ are treated as nuisance parameters. The simplest and most straightforward
possibility is a joint maximum a posteriori (MAP) auto-calibration approach.
Separating the criterion with respect to the noise variance $\sigma^2$ and the signal
waveform matrix $\mathbf{S}$, the MAP criterion can be expressed as

$$
\{\hat{\boldsymbol{\theta}}, \hat{\boldsymbol{\rho}}\} = \arg\min_{\boldsymbol{\theta}, \boldsymbol{\rho}} \; NM \log \mathrm{Tr}(\mathbf{P}_{\mathbf{A}}^{\perp} \hat{\mathbf{R}}) + \frac{1}{2} (\boldsymbol{\rho} - \boldsymbol{\rho}_0)^T \mathbf{C}_\rho^{-1} (\boldsymbol{\rho} - \boldsymbol{\rho}_0) \tag{3.26}
$$

It is interesting to interpret Equation (3.26) as a regularization of the supposedly
ill-conditioned NLLS criterion. In theory, this means that the identifiability
problem has been overcome. However, it should be stressed that for the case
where joint modeling is nonunique without regularization, the array perturbations
$\boldsymbol{\rho} - \boldsymbol{\rho}_0$ need to be “small enough.” This is because the information from
the data can only move $\boldsymbol{\rho}_0$ a small distance in the direction of the true array
parameters, and thus can only reduce the effect of the modeling error on the
estimation of $\boldsymbol{\theta}$ to a limited extent as compared to using the nominal value $\boldsymbol{\rho}_0$
as if it were true. In contrast, when the joint parameterization is unique, the size
of the array perturbations is essentially irrelevant provided the true minimum
of the respective criteria can be found. Alternative approaches to MAP-based
auto-calibration are presented in Jansson et al. (1998), Viberg and Swindlehurst
(1994b), and Wahlberg et al. (1991), using a stochastic model of the signal
waveforms.


3.5 CALIBRATION USING SOURCES AT KNOWN 
POSITIONS 

In the previous section we discussed methods that attempt to calibrate the array
using the same data as used for DOA estimation. As was seen, this requires a
parametric model with a known structure and relatively few parameters. This
is not always a practical assumption. A useful remedy is to collect additional
calibration data, normally in an anechoic chamber prior to the actual use of the
antenna array. In most cases a single emitter is used, placed at several known
positions. The calibration data is used to compute array response vectors for the
different source directions. The calibration data for a wavefront coming from the
DOA $\eta_c$ is modeled by

$$
\mathbf{x}_c(t) = \mathbf{a}_c s_c(t) + \mathbf{n}_c(t), \quad t = 1, \ldots, N_c; \; c = 1, \ldots, C \tag{3.27}
$$

where $\mathbf{a}_c = \mathbf{a}(\eta_c)$ is the “true” array response vector at $\eta_c$, $C$ is the number of
emitter positions, and $N_c$ is the number of data samples taken at position $c$. We
distinguish between two cases of interest:

Coherent calibration. In this case $s_c(t)$ is known, and the array response vectors
are estimated as

$$
\hat{\mathbf{a}}_c = \frac{\sum_{t=1}^{N_c} \mathbf{x}_c(t) s_c^*(t)}{\sum_{t=1}^{N_c} |s_c(t)|^2} \tag{3.28}
$$

This results in perfect calibration vectors for $N_c \to \infty$, even if the noise in
the calibration data is not spatially white.

Noncoherent calibration. When $s_c(t)$ is unknown, for example because of
synchronization problems, the array response is instead computed from the
principal eigenvector of the sample covariance matrix

$$
\hat{\mathbf{R}}_c = \frac{1}{N_c} \sum_{t=1}^{N_c} \mathbf{x}_c(t) \mathbf{x}_c^*(t) = \sum_{k=1}^{M} \hat{\lambda}_k \hat{\mathbf{e}}_k \hat{\mathbf{e}}_k^*, \qquad \hat{\mathbf{a}}_c \propto \hat{\mathbf{e}}_1 \tag{3.29}
$$

In this case, the resulting estimates have a gain and phase ambiguity, which
may need to be taken into account when the calibration vectors are used.
Also, the noise needs to be spatially white to give consistent estimates.
If this is not the case, data with no sources present can be used to determine
the noise color, after which a prewhitening can be applied. For high
enough SNR, the noise color can of course be ignored.
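
The two estimators can be compared on synthetic data with a short Python sketch. Everything about the test scenario (array size, DOA, noise level, number of snapshots) is an assumption chosen for illustration, and the function names are hypothetical.

```python
import numpy as np

def coherent_estimate(X, s):
    """Equation (3.28): X holds M x N_c snapshots, s is the known waveform."""
    return X @ s.conj() / np.sum(np.abs(s) ** 2)

def noncoherent_estimate(X):
    """Equation (3.29): principal eigenvector of the sample covariance matrix."""
    R = X @ X.conj().T / X.shape[1]
    _, V = np.linalg.eigh(R)              # eigenvalues in ascending order
    return V[:, -1]                       # known only up to a complex scale factor

# Assumed scenario: 6-element half-wavelength ULA, emitter at 10 degrees.
rng = np.random.default_rng(0)
M, N = 6, 2000
a_true = np.exp(1j * np.pi * np.arange(M) * np.sin(np.deg2rad(10.0)))
s = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
noise = 0.1 * (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
X = np.outer(a_true, s) + noise

a_coh = coherent_estimate(X, s)
a_non = noncoherent_estimate(X)

# The noncoherent estimate can only be compared up to scale, e.g. via the
# normalized correlation with the true response vector.
corr = np.abs(np.vdot(a_non, a_true)) / (np.linalg.norm(a_non) * np.linalg.norm(a_true))
```

The coherent estimate can be compared with the true vector directly, whereas the noncoherent one is judged by a scale-invariant measure, which reflects the gain and phase ambiguity noted above.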


3.5.1 Parametric Methods 

If a parametric model for the array parameters can be assumed, approaches
similar to auto-calibration can be applied. However, since they are used with
calibration data with sources at known positions, much more flexible error models
can be handled. A straightforward approach is to apply an LS fit of the model to
the calibration vectors. In the coherent case, where $\hat{\mathbf{a}}_c$ is computed according to
Equation (3.28), this leads to

$$
\hat{\boldsymbol{\rho}} = \arg\min_{\boldsymbol{\rho}} \sum_{c=1}^{C} \|\hat{\mathbf{a}}_c - \mathbf{a}(\eta_c, \boldsymbol{\rho})\|^2 \tag{3.30}
$$


For noncoherent calibration data, where $\hat{\mathbf{a}}_c$ is given by Equation (3.29), a complex
gain factor must be included:

$$
\{\hat{\boldsymbol{\rho}}, \hat{\boldsymbol{\gamma}}\} = \arg\min_{\boldsymbol{\rho}, \boldsymbol{\gamma}} \sum_{c=1}^{C} \|\hat{\mathbf{a}}_c - \mathbf{a}(\eta_c, \boldsymbol{\rho}) \gamma_c\|^2 \tag{3.31}
$$

See Ng and Nehorai (1995) and Ng and See (1996) for extensions and
performance analysis with application to sensor position uncertainties.

Parametric calibration is of course simplest when the array parameters enter
the array response linearly. In this context, the linear model is often expressed as

$$
\mathbf{a}_c = \mathbf{Q}\mathbf{a}(\eta_c) \tag{3.32}
$$

When $\mathbf{Q}$ is independent of the DOA $\eta$, Equation (3.32) is a global calibration
model. Two cases are of special interest (e.g., Pierre and Kaveh, 1995):

• Q is a full matrix, which can model, for example, channel and mutual coupling 
errors. 

• Q is a diagonal matrix, which can correct for channel errors only. 

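In both cases the correction matrix is determined using an LS fit. Assuming
coherent calibration, we have

$$
\hat{\mathbf{Q}} = \arg\min_{\mathbf{Q}} \|\hat{\mathbf{A}}_c - \mathbf{Q}\mathbf{A}(\boldsymbol{\eta}_c)\|_F^2 \tag{3.33}
$$

where $\hat{\mathbf{A}}_c = [\hat{\mathbf{a}}_1, \ldots, \hat{\mathbf{a}}_C]$ and $\mathbf{A}(\boldsymbol{\eta}_c) = [\mathbf{a}(\eta_1), \ldots, \mathbf{a}(\eta_C)]$. When $\mathbf{Q}$ is a full matrix,
the solution is

$$
\hat{\mathbf{Q}} = \hat{\mathbf{A}}_c \mathbf{A}^*(\boldsymbol{\eta}_c) \left(\mathbf{A}(\boldsymbol{\eta}_c) \mathbf{A}^*(\boldsymbol{\eta}_c)\right)^{-1} \tag{3.34}
$$

Note that $C \ge M$ is necessary to guarantee a unique solution. In the diagonal case,
$\mathbf{Q} = \mathrm{diag}\{\mathbf{q}\}$. The elements of $\mathbf{q} = [q_1, \ldots, q_M]^T$ are estimated separately as

$$
\hat{q}_m = \hat{\mathbf{A}}_{c,m} \mathbf{A}_m^*(\boldsymbol{\eta}_c) \left(\mathbf{A}_m(\boldsymbol{\eta}_c) \mathbf{A}_m^*(\boldsymbol{\eta}_c)\right)^{-1} \tag{3.35}
$$

where $\hat{\mathbf{A}}_{c,m}$ and $\mathbf{A}_m(\boldsymbol{\eta}_c)$ denote the $m$th row of the respective matrix. These
global calibration methods are simple and efficient. Their main drawback is that
they cannot handle DOA-dependent errors due, for example, to uncertain element
positions.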
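
As an illustration of Equations (3.34) and (3.35), the following Python sketch fits a full and a diagonal correction matrix to noise-free calibration vectors. The 4-element half-wavelength ULA, the calibration grid, and the function names are assumptions made for the example. The full-matrix fit is computed with a least-squares solver rather than by forming the inverse in (3.34) explicitly, which gives the same minimizer when the nominal response matrix has full row rank.

```python
import numpy as np

def nominal_ula(eta_deg, M):
    """M x C matrix of nominal half-wavelength ULA responses (assumed geometry)."""
    eta = np.atleast_1d(np.deg2rad(eta_deg))
    return np.exp(1j * np.pi * np.outer(np.arange(M), np.sin(eta)))

def fit_full_Q(A_hat, A_nom):
    """LS solution of (3.33) for a full Q; needs C >= M and full row rank of A_nom."""
    # Q A_nom ~ A_hat is solved in transposed form: A_nom^T Q^T ~ A_hat^T.
    Qt, *_ = np.linalg.lstsq(A_nom.T, A_hat.T, rcond=None)
    return Qt.T

def fit_diag_q(A_hat, A_nom):
    """Equation (3.35): one scalar LS fit per array element (matrix row)."""
    return np.sum(A_hat * A_nom.conj(), axis=1) / np.sum(np.abs(A_nom) ** 2, axis=1)

rng = np.random.default_rng(0)
M = 4
eta = np.arange(-40.0, 41.0, 5.0)                     # C = 17 calibration DOAs
A_nom = nominal_ula(eta, M)

Q_true = np.eye(M) + 0.1 * (rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M)))
Q_hat = fit_full_Q(Q_true @ A_nom, A_nom)             # noise-free full-matrix case

q_true = 1 + 0.1 * (rng.standard_normal(M) + 1j * rng.standard_normal(M))
q_hat = fit_diag_q(q_true[:, None] * A_nom, A_nom)    # noise-free diagonal case
```

With noisy calibration vectors the full matrix has $M^2$ free parameters against $M$ for the diagonal model, so its estimates can be expected to show a higher variance.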

3.5.2 Interpolation of Calibration Data 

Provided the calibration vectors are accurately measured, we can consider the
array response to be essentially known at the calibration points. We can thus
regard the calibration as an interpolation problem. The most direct approach is
to interpolate the real and imaginary parts of the $M$ elements of the measured
calibration vectors $\{\hat{\mathbf{a}}_c\}_{c=1}^{C}$ to any desired DOA $\theta$ using linear or spline
interpolation. In general, this requires a very dense calibration grid since the
array response is usually designed to vary rapidly with the DOA (necessary
to enable high-resolution DOA estimation, which is the array’s intended use).
A more viable approach can be applied if a nominal model $\mathbf{a}(\theta)$ is available. Note
that this is conceptually different from the parametric calibration case, since it is
not necessary to specify how the nominal model depends on the array parameters.
The “true” array response is now expressed as

$$
\mathbf{a}_c = \mathbf{Q}(\eta_c)\mathbf{a}(\eta_c) \tag{3.36}
$$

where $\mathbf{Q}(\eta) = \mathrm{diag}\{\mathbf{q}(\eta)\}$ is now a local (DOA-dependent) correction matrix.
Using the calibration data, the correction $\mathbf{q}(\eta)$ can be determined at the DOAs
$\{\eta_c\}_{c=1}^{C}$ as

$$
\hat{\mathbf{q}}(\eta_c) = \hat{\mathbf{a}}_c \,./\, \mathbf{a}(\eta_c) \tag{3.37}
$$

where ./ denotes element-wise division. The idea now is that the correction
matrix will be a much smoother function of the DOA, provided that the nominal
model captures most of the $\theta$ dependence. Thus, linear or spline-based
interpolation can be used to interpolate the real and imaginary parts of $\mathbf{q}(\cdot)$ instead of
directly interpolating the calibration vectors. Given an interpolated $\hat{\mathbf{q}}(\theta)$ at a
certain desired DOA $\theta$, the array response is then computed as
$\hat{\mathbf{a}}(\theta) = \mathrm{diag}\{\hat{\mathbf{q}}(\theta)\}\mathbf{a}(\theta)$.
For a given calibration grid, this results in a (much) more accurate interpolation.
We close this section by noting that it is of course possible to interpolate gain
and phase rather than real and imaginary parts, although this in general does not
change the result significantly. We also note that the same interpolation can in
principle be used to calibrate over a multidimensional parameter space, such as
azimuth, elevation, and frequency dependence. However, this is considerably
more complicated, and may necessitate a more sparse calibration grid in order
to keep the measurement costs manageable.
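
A minimal Python sketch of this correction-based interpolation is given below, assuming a half-wavelength ULA as the nominal model and plain linear interpolation via `numpy.interp`. The function names and the test scenario are illustrative assumptions.

```python
import numpy as np

def nominal_ula(eta_deg, M):
    """M x C matrix of nominal half-wavelength ULA responses (assumed model)."""
    eta = np.atleast_1d(np.deg2rad(eta_deg))
    return np.exp(1j * np.pi * np.outer(np.arange(M), np.sin(eta)))

def interpolate_response(theta_deg, eta_deg, A_hat, M):
    """Interpolate q = a_hat ./ a_nominal, then re-apply the nominal model."""
    Qc = A_hat / nominal_ula(eta_deg, M)          # corrections (3.37), one column per eta_c
    q = np.array([
        np.interp(theta_deg, eta_deg, row.real) + 1j * np.interp(theta_deg, eta_deg, row.imag)
        for row in Qc                             # real and imaginary parts handled separately
    ])
    return q * nominal_ula(theta_deg, M)[:, 0]    # a_hat(theta) = diag{q(theta)} a(theta)

# Assumed scenario: DOA-independent gain errors, calibration grid with 5 degree spacing.
M = 4
eta = np.arange(-40.0, 41.0, 5.0)
q0 = np.array([1.0, 0.9 + 0.1j, 1.1 - 0.05j, 0.95j])
A_hat = q0[:, None] * nominal_ula(eta, M)
a_interp = interpolate_response(7.3, eta, A_hat, M)
```

Because the correction is constant in this scenario, the interpolated response at the off-grid DOA of 7.3° matches the true one to numerical precision, which would not be the case if the rapidly varying calibration vectors were interpolated directly on the same grid.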


3.6 ARRAY INTERPOLATION TECHNIQUES 

Interpolation of calibration data was introduced in Section 3.5.2. In this section
we describe two more elaborate interpolation techniques. The first is based on a
Fourier representation of the array response; the second uses a local polynomial
approximation. Both methods assume the availability of calibration data in the
form of Equation (3.28).


3.6.1 Fourier-Based Array Interpolation 

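This approach is based on the so-called Manifold Separation theorem (Belloni
et al., 2007; Doron and Doron, 1994a), which says that any array can under mild
conditions be represented by

$$
\mathbf{a}_c(\eta) = \mathbf{G}\mathbf{v}(\eta) + \mathbf{r}(\eta) \tag{3.38}
$$

where $\mathbf{G}$ is the $M \times \bar{M}$, $\bar{M} > M$, “sampling matrix,” and (assuming $\bar{M}$ to be odd)

$$
\mathbf{v}(\eta) = \left[ e^{-j\frac{\bar{M}-1}{2}\eta}, \; e^{-j\frac{\bar{M}-3}{2}\eta}, \; \ldots, \; e^{j\frac{\bar{M}-3}{2}\eta}, \; e^{j\frac{\bar{M}-1}{2}\eta} \right]^T \tag{3.39}
$$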


is a Fourier vector. Note that $\eta$ in Equation (3.39) is the “physical” DOA, and
should not be interpreted as the electrical angle as in Equation (3.12). The error
$\|\mathbf{r}(\eta)\|$ is shown to vanish “rapidly” as $\bar{M} \to \infty$. The representation (3.38) is a
truncated Fourier series representation of the array response, which is always a
$2\pi$-periodic function. In the antenna literature, the elements of the Fourier vector
(3.39) are often termed phase modes. The idea behind the Fourier-based
interpolation approach is that most arrays can be well approximated by a relatively
small number of phase modes, since the corresponding terms in the sampling
matrix decay super-exponentially (under some assumptions: Doron and Doron,
1994a). Given enough calibration vectors $\hat{\mathbf{A}}_c$, it is tempting to try to compute $\mathbf{G}$
by solving an LS problem similar to Equation (3.33):


$$
\hat{\mathbf{G}} = \arg\min_{\mathbf{G}} \|\hat{\mathbf{A}}_c - \mathbf{G}\mathbf{V}(\boldsymbol{\eta}_c)\|_F^2 \tag{3.40}
$$

In practice this can be problematic due to the potential ill-conditioning of
$\mathbf{V}(\boldsymbol{\eta}_c)$ and the large dynamic range in $\mathbf{G}$. Instead, Belloni et al. (2007) propose
to compute the sampling matrix in two steps. First, an $M \times C$ matrix $\hat{\mathbf{G}}_C$ is
computed, thus using a maximum number of phase modes equal to the number of
calibration vectors $C$. If the calibration vectors have been collected on a uniform
grid covering the whole range $(0, 2\pi)$, $\hat{\mathbf{G}}_C$ can be computed by an inverse discrete
Fourier transform (IDFT) of $\hat{\mathbf{A}}_c$. This can be done efficiently using inverse fast
Fourier transform (IFFT), and there is no risk of error amplification since the DFT
vectors are orthogonal. The second step is to truncate the sampling matrix $\hat{\mathbf{G}}_C$
to the desired size $M \times \bar{M}$, for example by selecting the $\bar{M}$ “middle” columns,
which correspond to the lowest-order phase modes. Choosing the number of
phase modes $\bar{M}$ is crucial and can be guided by the Euclidean norm of the
columns. In fact, if the size of the error in the calibration vectors is known,
Belloni et al. (2007) suggest an “optimal” procedure for choosing $\bar{M}$ based on
matching the approximation and the calibration errors.
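
The two-step computation can be sketched in Python as follows. This is a hedged illustration rather than the procedure of Belloni et al. (2007) in detail: the test array is a hypothetical 4-element uniform circular array of radius half a wavelength, the calibration grid is assumed to be uniform over the full circle with an odd number of points, and no normalization constant is included in the Fourier vector. Whether the transform is called "forward" or "inverse" depends on the sign convention; with phase modes $e^{jk\eta}$, NumPy's `fft` scaled by $1/C$ returns the required coefficients.

```python
import numpy as np

def sampling_matrix(A_cal, n_modes):
    """Truncated sampling matrix from calibration data on eta_c = 2*pi*c/C (C odd)."""
    C = A_cal.shape[1]
    # Step 1: one Fourier coefficient per calibration point, reordered so that
    # the columns run over the modes -(C-1)/2, ..., (C-1)/2.
    G_full = np.fft.fftshift(np.fft.fft(A_cal, axis=1) / C, axes=1)
    # Step 2: keep the n_modes "middle" columns (lowest-order phase modes).
    mid, half = C // 2, n_modes // 2
    return G_full[:, mid - half: mid + half + 1]

def fourier_vector(eta, n_modes):
    """Vector of phase modes exp(j*k*eta), k = -(n_modes-1)/2, ..., (n_modes-1)/2."""
    k = np.arange(-(n_modes // 2), n_modes // 2 + 1)
    return np.exp(1j * k * eta)

def uca(eta, M=4, radius=0.5):
    """Hypothetical test array: uniform circular array, radius in wavelengths."""
    phi = 2 * np.pi * np.arange(M) / M
    return np.exp(1j * 2 * np.pi * radius * np.cos(np.atleast_1d(eta)[None, :] - phi[:, None]))

C, n_modes = 63, 31
eta_c = 2 * np.pi * np.arange(C) / C
G = sampling_matrix(uca(eta_c), n_modes)
a_interp = G @ fourier_vector(0.1234, n_modes)     # response at an off-grid DOA
```

For this small array the coefficients decay quickly with the mode order, so 31 phase modes reproduce the response at an off-grid DOA almost exactly. The norms of the columns of the untruncated matrix give a simple check of where truncation is reasonable.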

Besides the calibration aspect, an obvious advantage of Fourier-based
interpolation is that the Fourier vectors resemble the array response for a ULA. Thus,
the plethora of computationally efficient methods for the ULA case are directly
applicable once the sampling matrix has been found. The difference with the
ULA case is that $\mathbf{v}(\eta)$ depends directly on $\eta$ and not on $\sin\eta$, as it would for
a “physical” array. Since this does not change the “ULA structure” of the
virtual response vector, one could ask if there are even better choices of nonlinear
transformation $g(\eta)$ that could result in a better approximation for a given size
$\bar{M}$. An interesting contribution in this direction is given in Bühren et al. (2004),
although it is not directly applicable to the calibration case considered here.


3.6.2 Interpolation Based on Local Polynomial 
Approximation 


The global diagonal calibration (3.35) and the linear interpolation of correction
vectors (3.37) represent two extreme cases of interpolation. In the former, the
correction is assumed to be globally valid; in the latter, it is only valid between
two calibration points. If the “true” correction functions are smooth, but not
necessarily constant, it is natural to apply a method that takes several, but not
all, neighboring points into account in the interpolation. In the absence of more
structural information, these should be weighted according to their distance to the
DOA $\theta$ of current interest. One way to achieve this smoothed interpolation is to
model the correction vectors (3.37) using a set of basis functions $\{\varphi_l(x)\}_{l=0}^{L}$
that are centered at the “query point” $\theta$. The $m$th element of $\mathbf{q}(\eta_c)$ is thus
expressed as

$$
q_m(\eta_c) = \sum_{l=0}^{L} \alpha_l \varphi_l(\eta_c - \theta) = \boldsymbol{\varphi}^T(\eta_c - \theta)\boldsymbol{\alpha} \tag{3.41}
$$

It is beneficial to model the real and imaginary parts of $q_m(\eta_c)$ separately,
and thus we assume that $q_m(\eta_c)$ is real-valued in Equation (3.41). Once the
coefficients $\alpha_l$ have been determined, the sought interpolator is obtained as
$\hat{q}_m(\theta) = \boldsymbol{\varphi}^T(0)\hat{\boldsymbol{\alpha}}$. The most commonly applied basis function is $\varphi_l(x) = x^l$, in
which case the approximation is called local polynomial approximation (LPA)
(Fan and Gijbels, 1996; Katkovnik et al., 2006). See Lanne et al. (2006, 2007)
for its use in array calibration. In Fan and Gijbels (1996) it is argued that an odd
polynomial order $L$ should be chosen, and in array calibration applications we
have mostly used $L = 1$ (i.e., a locally linear approximation).

The polynomial coefficients $\boldsymbol{\alpha}$ are determined by a weighted regression

$$
\hat{\boldsymbol{\alpha}} = \arg\min_{\boldsymbol{\alpha}} \sum_{c=1}^{C} w_h(|\eta_c - \theta|) \left| q_m(\eta_c) - \boldsymbol{\varphi}^T(\eta_c - \theta)\boldsymbol{\alpha} \right|^2 \tag{3.42}
$$

Here $w_h(x)$ represents a weighting function or smoothing kernel, which should
be a decreasing function of $x$. How fast $w_h(x)$ decays depends on the smoothing
parameter (or bandwidth) $h$. Once the polynomial coefficients have been found
from Equation (3.42), the interpolated value is computed from the postulated
model (3.41). In the LPA case, the sought interpolation value is simply given as
$\hat{q}_m(\theta) = \hat{\alpha}_0$. In fact, the higher-order coefficients give smoothed estimates of the
derivatives: $\hat{q}'_m(\theta) = \hat{\alpha}_1$ and so on (Fan and Gijbels, 1996).

A common choice of weighting is the Gaussian kernel, $w_h(x) \propto e^{-x^2/2h^2}$,
in which case the smoothing parameter $h$ is interpreted as a standard deviation.
The actual shape of the kernel may not be so critical, but the choice of $h$ will
directly influence the amount of smoothing that is applied. When $h \to 0$ we
expect the LPA to behave as a direct linear interpolation, whereas $h \to \infty$ means
that a global model is used. By varying $h$ we can thus “interpolate” between
these extreme cases. The choice is a trade-off between bias and variance, and
the optimal value depends on the reliability of the estimated correction values
$q_m(\eta_c)$, as well as the smoothness of the true function $q_m(\theta)$. Since the latter
is typically not known, the bandwidth $h$ must in practice be determined from
the available data only. A technique often applied in the statistical literature is
cross-validation. The data $\{q_m(\eta_c)\}_{c=1}^{C}$ is then split into an estimation set (for
example, $\{q_m(\eta_1), q_m(\eta_3), \ldots, q_m(\eta_{C-1})\}$) and a validation set (for example,
$\{q_m(\eta_2), q_m(\eta_4), \ldots, q_m(\eta_C)\}$); it might be wise to exclude the end point from the
validation data to avoid data extrapolation. Estimates of the validation data are
computed using interpolation of the estimation data, and the mean-square error
(MSE) as compared to the actual validation data is computed. The optimal value
of $h$ is then the one that minimizes this MSE. With small data samples, the roles
of the estimation and the validation sets can be reversed to obtain a new estimate of the
MSE, and $h$ is then chosen to minimize the average between the two.
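
The weighted regression (3.42) with a Gaussian kernel and the cross-validation choice of bandwidth can be sketched in Python as below. This is an illustrative sketch under assumptions: the data are one real-valued component of a correction, the split simply takes every other grid point (which, for an odd number of grid points, keeps both end points in the estimation set), and the candidate bandwidths and function names are hypothetical.

```python
import numpy as np

def lpa(eta, q, theta, h, L=1):
    """Local polynomial estimate of a real-valued correction at the query point theta."""
    x = eta - theta
    w = np.exp(-x**2 / (2 * h**2))                  # Gaussian kernel with bandwidth h
    Phi = np.vander(x, L + 1, increasing=True)      # columns x^0, x^1, ..., x^L
    sw = np.sqrt(w)
    # Weighted LS (3.42), written as ordinary LS on sqrt-weighted rows.
    alpha, *_ = np.linalg.lstsq(Phi * sw[:, None], q * sw, rcond=None)
    return alpha[0]                                 # interpolated value is alpha_0

def cv_bandwidth(eta, q, candidates, L=1):
    """Pick h by cross-validation: fit on every other point, validate on the rest."""
    est, val = slice(0, None, 2), slice(1, None, 2)
    mse = []
    for h in candidates:
        pred = np.array([lpa(eta[est], q[est], t, h, L) for t in eta[val]])
        mse.append(np.mean((pred - q[val]) ** 2))
    return candidates[int(np.argmin(mse))]

# Assumed data: a slowly varying correction observed with measurement noise.
rng = np.random.default_rng(0)
eta = np.linspace(-40.0, 40.0, 17)
q_noisy = 1.0 + 0.002 * eta + 0.01 * rng.standard_normal(eta.size)
h_cv = cv_bandwidth(eta, q_noisy, [5.0, 10.0, 20.0, 40.0])
q_at_7 = lpa(eta, q_noisy, 7.3, h_cv)
```

For data that are exactly linear in the DOA, the locally linear fit reproduces the underlying line for any bandwidth, so the bandwidth matters only when noise or curvature is present.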

One drawback with this approach is that the grid size in the estimation set is
different from the one used in the original calibration data, and thus the calculated
$h$ may not be optimal when using the full data set. Another disadvantage is that
it assumes that a single $h$ can be used for the entire range of DOAs in the
calibration data. Other ways to determine the bandwidth parameter, including
adaptive scale methods, are discussed in Fan and Gijbels (1996), Katkovnik
(1998), and Katkovnik et al. (2006).


It should be stressed that the basis function expansion (3.41) is only valid
locally around the query point $\theta$. Thus, a new set of coefficients $\boldsymbol{\alpha}$ must be
computed for each DOA $\theta$ of interest. When evaluating, for example, the MUSIC
spectrum at a dense set of DOAs, the LPA-based interpolation technique is thus
computationally demanding, particularly when an optimal bandwidth $h$ needs to
be found for each real and imaginary part of each antenna element. However, if
sufficient memory is available, the bulk of the computations can be done offline
by computing new “artificial” calibration vectors on a finer grid than the original
measured data. A simpler method like linear interpolation can then be used on
the new “cleaned” calibration data. As we will see in the next section, the local
interpolation technique can potentially yield superior performance in difficult
scenarios, although the choice of $h$ is both critical and difficult.


3.7 COMPARISON OF APPROACHES 


The purpose of this section is to compare the applicability of the various
calibration methods. This comparison is not always easy since the different approaches
rely on their own assumptions. Table 3.1 classifies the techniques in terms of
their dependence on a nominal model $\mathbf{a}(\theta)$ or a parametric model $\mathbf{a}(\theta, \boldsymbol{\rho})$ of the
array response, as well as in terms of the need for calibration data.

Another aspect of interest is the computational complexity of the different
calibration methods. Without going into detail, we note that the auto-calibration
approach is in general the most demanding from this point of view because of the
many parameters it involves. Also, interpolation using LPA requires substantial
computation, but this is less critical if the computations are done offline, as
explained in Section 3.6.2.


3.7.1 Simulation Setup 

In the remainder of this section we present empirical results from computer
simulations of the methods under various sources of error. The MUSIC algorithm
(see Section 3.3) is applied using different calibration schemes and, in the first
case, auto-calibration techniques as well. All simulations use a ULA of $M = 10$
elements with half-wavelength (nominal) separation. Two equi-powered signal
sources are used, with the second DOA fixed at $\theta_2 = 12°$ with respect to array
broadside, and the first varying between $\theta_1 = 5°$ and $\theta_1 = 11°$. The DOA estimation
is free from finite sample errors; that is, an infinite number of samples ($N \to \infty$) is
assumed. This is implemented by generating the array covariance matrix according to
Equation (3.6), making the SNR irrelevant. The “true” array response used in
the simulations is computed as

$$
\mathbf{a}(\theta) = \mathbf{G} \left[ e^{2\pi j (x_1 \sin\theta + y_1 \cos\theta)}, \; \ldots, \; e^{2\pi j (x_M \sin\theta + y_M \cos\theta)} \right]^T \tag{3.43}
$$

Thus, $k = 2\pi$ is assumed, which means that the element positions are normalized
to the wavelength.

TABLE 3.1 Classification of Calibration Techniques

                    Without Calibration Data    With Calibration Data
No model            -                           Fourier-based interpolation
Nominal model       Robust estimation           Interpolation of correction q(η)
Parametric model    Auto-calibration            Parametric calibration

The nominal array is an ideal ULA with half-wavelength element separation,
so $x_m = (m - 1)/2$ and $y_m = 0$. In the model (Equation 3.43), all signals are
assumed to reside in the xy-plane, and the DOA is taken clockwise relative to the
y-axis. Different errors are applied to the gain matrix (nominally $\mathbf{G} = \mathbf{I}$) and the
element positions. In all cases, calibration data is collected on a uniform grid from
the DOAs $\eta_1 = -40°$ to $\eta_C = 40°$. The grid size is varied between $\Delta\eta = 1°$ and
$\Delta\eta = 20°$, which means that the number of calibration vectors varies between
$C = 81$ and $C = 5$. All of these cases can be considered as relatively sparse
calibration, which keeps data collection costs low. This is particularly apparent when
the calibration needs to be done over several parameters, such as azimuth,
elevation, and frequency. In all simulations, a coherent calibration is assumed, and
the calibration vectors are generated as

$$
\hat{\mathbf{a}}_c = \mathbf{a}_c + \tilde{\mathbf{a}}_c \tag{3.44}
$$

where $\mathbf{a}_c$ is the true array response in the direction $\eta_c$ and $\tilde{\mathbf{a}}_c$ is zero-mean
complex Gaussian distributed with covariance matrix $E[\tilde{\mathbf{a}}_c \tilde{\mathbf{a}}_c^*] = \sigma_a^2 \mathbf{I}$. The
standard deviation is varied from $\sigma_a = 0.01$ to $\sigma_a = 0.1$. The former is a realistic
measurement accuracy in an anechoic chamber, whereas the latter can account
for perturbations in the operational data (e.g., due to temperature drift) that
were not present during calibration data collection. In all figures that follow,
the empirical RMS error (RMSE) is based on 500 independent realizations of
the array perturbations for each point. Cases where an algorithm fails to resolve
the sources (only one local extremum within $\pm\Delta\theta$ of the true value) or where the
DOA estimation error is larger than half the DOA separation, $\Delta\theta$, are declared
failures and are not included in the RMSE calculation. If the empirical failure
rate exceeds 40%, the corresponding RMSE value is not included in the plot.
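
The data generation described above can be sketched in Python as follows. This is only an outline of the setup under the stated assumptions (positions in wavelengths, circularly symmetric calibration noise); the function names and the random seed are hypothetical, and the MUSIC processing and failure bookkeeping are not included.

```python
import numpy as np

def array_response(theta_deg, x, y, G=None):
    """Equation (3.43): positions in wavelengths, DOA clockwise from the y-axis."""
    th = np.atleast_1d(np.deg2rad(theta_deg))
    phase = 2 * np.pi * (np.outer(x, np.sin(th)) + np.outer(y, np.cos(th)))
    A = np.exp(1j * phase)                       # one column per DOA
    return A if G is None else G @ A

def calibration_vectors(eta_deg, x, y, G, sigma_a, rng):
    """Equation (3.44): true response plus complex Gaussian noise, covariance sigma_a^2 I."""
    A = array_response(eta_deg, x, y, G)
    noise = sigma_a * (rng.standard_normal(A.shape) + 1j * rng.standard_normal(A.shape)) / np.sqrt(2)
    return A + noise

M = 10
x_nom, y_nom = np.arange(M) / 2, np.zeros(M)     # nominal ULA: x_m = (m - 1)/2, y_m = 0
eta = np.arange(-40, 41, 5)                      # 5 degree grid, C = 17
rng = np.random.default_rng(1)
A_hat = calibration_vectors(eta, x_nom, y_nom, np.eye(M), 0.01, rng)
```

Gain and position errors are introduced by passing a perturbed `G` and perturbed `x`, `y` to `calibration_vectors`, while the nominal model used by the interpolation methods keeps the unperturbed values.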

3.7.2 DOA-Independent Modeling Errors 

In Example 3.1 we consider direction-independent errors only, meaning that the
element positions are fixed at their nominal values. The calibration DOAs are
$\boldsymbol{\eta} = [-40°, -35°, \ldots, 40°]$. That is, $C = 17$ calibration vectors are generated. The
standard deviation of the calibration vectors is $\sigma_a = 0.01$.


Example 3.1 Channel Errors 

The first case considers channel errors only, so $\mathbf{G} = \mathbf{I} + \mathrm{diag}(\mathbf{g})$, where the
elements of $\mathbf{g}$ are zero-mean i.i.d., circularly symmetric, and distributed as
$g_m \in \mathcal{N}(0, \sigma_g^2)$, where $\sigma_g$ is varied from 0.02 to 0.22. The DOAs are fixed at
$\boldsymbol{\theta} = [9°, 12°]^T$, which corresponds to approximately half-beamwidth separation.
We compare the following DOA estimation methods:


• MUSIC with no calibration.

• Auto-calibration using the W-F approach.

• Auto-calibration with the MAP method (3.26), using a correct statistical
model of the array perturbations.

• MUSIC using linear interpolation of $\mathrm{Re}\{q_m(\theta)\}$ and $\mathrm{Im}\{q_m(\theta)\}$.

• MUSIC with a global full-correction matrix $\mathbf{Q}$, which is estimated using
calibration data.

• MUSIC with LPA-based interpolation of $\mathbf{q}(\theta)$, with $L = 1$ (locally linear
model). A Gaussian window is used, where the window parameter $h$ is
optimized using cross-validation (CV) as described in Section 3.6.2. This
is termed LPA CV.

• Similar to LPA CV, but optimal $h$ is found using the true function $q_k(\theta)$.
This “genie-aided” method, which is included as a reference, is called
“LPA genie” in the plot.

• MUSIC with a global Q = diag{q}, which is estimated using calibration 
data. Note that this is the “correct” model in this example. 

The W-F approach is included to illustrate the identifiability problem. With a
nonlinear array, the method yields perfect gain and DOA estimates in all cases,
since there are no finite sample errors. Note that the Fourier-based interpolation
technique in Section 3.6.1 is omitted in this example. As will be seen in
Example 3.3, the method is not applicable with such a sparse calibration grid
($\Delta\eta = 5°$).

Figure 3.3 shows the average RMSE of the different estimates of $\theta_1$ and $\theta_2$
versus the standard deviation of the gain errors. It is seen that the uncalibrated
MUSIC estimates deteriorate significantly as the size of the channel errors
increases. When the standard deviation of the gain errors is larger than 0.14,
the failure rate exceeds 40%, so these RMSE values are not included in the
plots. The W-F is unable to improve on the uncalibrated case, as expected. The
MAP method shows surprisingly good performance, especially for small
perturbations. This clearly shows that a correct application of prior knowledge can
mitigate the identifiability problem. However, MAP deteriorates rapidly as the
size of the channel errors increases.

In contrast, the methods that use calibration data are insensitive to the size
of the channel perturbations. It is also seen that linear interpolation performance
is the worst in this scenario because of the inability to take advantage of the
smoothness in $\mathbf{q}(\theta)$ (which is in fact constant in this case). Also, estimating a
full-correction matrix $\mathbf{Q}$ from the calibration data results in higher variance of
the estimates due to the large number of parameters, although the “true” model
$\mathbf{Q} = \mathrm{diag}\{\mathbf{q}\}$ is included as a special case.

Using the correct model (global $\mathbf{q}$) will of course result in the lowest DOA
RMSE. The LPA interpolation reaches essentially the same RMSE if the optimal
(which should be “large”) smoothing parameter $h$ is found (LPA genie), but the
estimation of $h$ from data by cross-validation (LPA CV) results in a notable
performance penalty.


Example 3.2 Mutual Coupling Errors

In the second case we consider the effect of mutual coupling errors. Referring to
Equation (3.43), the gain matrix is generated as $\mathbf{G} = \mathbf{I} + \tilde{\mathbf{G}}$, where the elements of
$\tilde{\mathbf{G}}$ are random. The magnitude of the perturbations is scaled so that the diagonal
elements are the largest and the error magnitude decreases across the
subdiagonals. The elements of $\tilde{\mathbf{G}}$ are generated as independent zero-mean complex
Gaussian random variables whose variances follow this scaling. The scaling is
inspired by a mutual coupling model for a ULA of half-wavelength dipoles, the
rationale being that the relative errors should be approximately constant. For
simplicity, the nominal model $\mathbf{G}_0 = \mathbf{I}$ is used in place of a more realistic coupling
model, keeping in mind that a known coupling does not affect the quality of the
DOA estimates significantly (Svantesson, 1998). In this case, the DOA of the first
source is varied from $\theta_1 = 5°$ to $\theta_1 = 11°$, whereas the second DOA is fixed at $\theta_2 = 12°$.

Figure 3.4 shows the empirical average RMSE of the MUSIC estimates
of $\theta_1$ and $\theta_2$ using no calibration, global $\mathbf{q}$ calibration, linear interpolation of
$\mathbf{q}(\theta)$, spline interpolation of $\mathbf{q}(\theta)$, LPA-based interpolation of $\mathbf{q}(\theta)$, and global
$\mathbf{Q}$ calibration, the last being the “true” model. As expected, the global $\mathbf{Q}$
calibration performs best, whereas a diagonal $\mathbf{q}$ results in a substantial bias despite
the errors in $\tilde{\mathbf{G}}$ being diagonal-dominant. The reference LPA genie achieves the
same (optimal) performance as global $\mathbf{Q}$, whereas the LPA CV and the linear
interpolation approaches are comparable, and their RMSE is slightly higher than
that of global $\mathbf{Q}$ and LPA genie.

FIGURE 3.4 Empirical RMSE of a DOA estimate versus DOA separation (mutual coupling errors
only).


Examples 3.1 and 3.2 illustrate the performance of the various methods under
DOA-independent (i.e., global) array perturbations. In both cases, as expected,
the method that uses the correct model performs best. However, it is also clear
that an interpolation-based approach, which makes no assumptions regarding
the error model, can give almost the same performance. If an optimal smoothing
parameter can be found, the LPA approach achieves the lowest RMSE.
However, estimating $h$ using CV results in a penalty, which means better methods are
needed. We emphasize that the model-free interpolation must be applied to the
correction data $\hat{\mathbf{q}}(\eta_c) = \hat{\mathbf{a}}_c \,./\, \mathbf{a}(\eta_c)$, not to the original calibration data $\hat{\mathbf{a}}_c$ directly.
Although not shown here, the latter approach fails completely in the studied
scenarios because of the relative sparseness of the calibration data, which leads to too
high a variability in the true function $\mathbf{a}_c(\eta)$ between the sampling points $\{\eta_c\}_{c=1}^{C}$.

3.7.3 DOA-Dependent Modeling Errors 

In the second case, both mutual coupling and element position errors are present.
The gain matrix is generated as in Example 3.2. Referring to Equation (3.43),
the $x_m$ and $y_m$ positions of the sensor elements are perturbed by i.i.d. $\mathcal{N}(0, 0.05^2)$
random variables. The DOAs are fixed at $\boldsymbol{\theta} = [9°, 12°]^T$. The examples illustrate
the importance of calibration data quality.


Example 3.3 Varying Calibration Grid 

In this example, the calibration grid is varied from $\Delta\eta = 1^\circ$ to $\Delta\eta = 20^\circ$. The
standard deviation of the calibration vectors is fixed at $\sigma_a = 0.01$. The
uncalibrated MUSIC method fails completely in this scenario, as does the global q
interpolation. The global Q approach yields reasonably good performance, but is
omitted in the plots. Instead, we include results for manifold separation
(Fourier-based interpolation). The method is implemented by solving the associated LS
problem (3.40), although it is highly ill-conditioned since the calibration
data is not available over the whole field of view. The dimension of the virtual
array is chosen as M = 49, which is an overall optimal choice for the smallest
calibration grids $\Delta\eta \in \{1^\circ, 2^\circ, 3^\circ\}$. Once the truncated sampling matrix $\hat{G}$ ($M \times M$)
is computed, it is used together with the Fourier vector (3.39) to interpolate the
array response model to any desired DOA $\theta$.

Figure 3.5 shows the empirical average RMSE versus the grid size. A number
of effects can be seen. The first is that the appropriate amount of smoothing depends on
the complexity of the model, that is, the variability of the true function to be
interpolated. A sparse grid means high variability, which favors the most
local interpolation method (simple linear or spline interpolation). When more
calibration data is available, there is less variation between the samples and
more smoothing is necessary. Consequently, the LPA approaches outperform the
others for finer interpolation grids. The second observation is that determining
the optimal smoothing parameter h is increasingly difficult when less calibration
data is available. For dense sampling, the LPA CV approach performs close to
the LPA genie design, whereas for sparse calibration data, the optimal h is only
found by the genie-aided optimization procedure, which cannot be implemented
in a practical situation.

FIGURE 3.5 Empirical RMSE of a DOA estimate versus calibration data grid size (position and
mutual coupling errors only).

Figure 3.5 also illustrates that the manifold separation approach suffers from
not taking advantage of the known (although poor) nominal response model.
Thus, the method requires more calibration data and is useful only for $\Delta\eta < 3^\circ$ in
the chosen scenario. The other interpolation techniques also perform acceptably
for $\Delta\eta = 20^\circ$ or more, but this is only possible because of the known nominal
array response model. Finally, there is no systematic difference between linear
and spline interpolation in this case.


Example 3.4 Varying Calibration Error Size 

In this example the calibration grid is fixed at $\Delta\eta = 5^\circ$, whereas the standard
deviation of the calibration vectors is varied from $\sigma_a = 0.01$ to $\sigma_a = 0.1$. The rest
of the scenario is the same as in Example 3.3. Figure 3.6 displays the empirical
average RMSE versus $\sigma_a$. It also illustrates how the quality of the calibration
data affects the optimal amount of smoothing in the interpolation. For increasing
variance of the calibration vectors, the LPA approach increasingly outperforms
the simple interpolation approaches. It is also interesting that linear interpolation
gives consistently better results than spline interpolation when calibration
errors increase. One possible explanation may be that spline interpolation fits a
slightly more complicated model to nonperfect data and thus suffers more from
overfitting when the calibration measurements are noisy.


Examples 3.3 and 3.4 show that array interpolation using a nominal model can
give highly accurate DOA estimates even if the calibration data is imperfect. Both
cases illustrate the trade-offs involved in choosing the amount of smoothing to be
applied. If the calibration data is accurate and the true function is highly variable
(sparse sampling), a simple linear interpolation is a good choice. However, as
the calibration data deteriorates or as the true function becomes smoother, a local
interpolation technique like LPA can give superior performance, provided a good
choice of smoothing parameter h is found.


FIGURE 3.6 Empirical RMSE of a DOA estimate versus the standard deviation of calibration 
vectors (position and mutual coupling errors only). 


3.8 CONCLUSION 

This chapter presented methods for software calibration (or model update) 
applied to high-resolution DOA estimation. The methods are divided into 
categories depending on the involved assumptions. Auto-calibration techniques 
do not require any special calibration data, but a detailed error model must be 
specified. If the error model has many unknown parameters, the joint estimation
of DOA and array parameters is likely to be ill-conditioned, leading to poor
estimates or even nonuniqueness. The situation can to some extent be mitigated 
if prior information regarding the array parameters is available, in the form of 
nominal values and standard deviations. For large errors or in the absence of a 
specific error model, it is necessary to collect calibration data using sources at 
known positions. This is in fact a standard procedure before operational use of an 
antenna array. If a model is known, estimation of error parameters is straightforward
when calibration data is available. A more realistic case is when a nominal
model is available but the precise effect of the various error sources cannot be 
accurately quantified. 

We demonstrated that interpolation of a correction factor for each sensor
is viable for this situation. Such an approach is model based in that it exploits
information regarding the nominal array but assumes no specific form for the
correction. The two extreme cases of interpolation are simple linear interpolation,
which uses no smoothing, and estimation of a global interpolation factor,
which results in maximum smoothing. Local polynomial approximation (LPA) 
was introduced as a technique for “interpolating” between these extreme cases. 
The amount of smoothing is automatically adapted to the variability of the true 
function using cross-validation. Although this gives satisfactory results in most 
scenarios, performance is still limited by the choice of smoothing parameter h. 
Thus, more investigations regarding this aspect should be carried out, in particular
trying “locally adaptive” methods where h is allowed to depend on the
direction of arrival. 
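
The role of the smoothing parameter h and its selection by cross-validation can be illustrated with a small sketch. The choices below are assumptions for illustration only and are not the chapter's exact LPA design: a first-order (local linear) polynomial, a Gaussian kernel of width `h`, real-valued data (for example the real or imaginary part of one sensor's correction factor), and leave-one-out cross-validation over a user-supplied list of candidate widths.

```python
import numpy as np

def local_linear(x0, x, y, h):
    """Local first-order polynomial fit at x0, Gaussian kernel of width h."""
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)          # kernel weights
    X = np.column_stack([np.ones_like(x), x - x0])  # basis centred at x0
    sw = np.sqrt(w)[:, None]
    beta, *_ = np.linalg.lstsq(sw * X, sw[:, 0] * y, rcond=None)
    return beta[0]                                  # fitted value at x0

def loo_cv_bandwidth(x, y, candidates):
    """Return the candidate h with the smallest leave-one-out squared error."""
    scores = []
    for h in candidates:
        err = 0.0
        for i in range(x.size):
            keep = np.arange(x.size) != i           # drop sample i
            err += (local_linear(x[i], x[keep], y[keep], h) - y[i]) ** 2
        scores.append(err)
    return candidates[int(np.argmin(scores))]

# Usage on a toy, noise-free linear function (any h reproduces it exactly).
x = np.linspace(0.0, 1.0, 11)
y = 2.0 * x + 1.0
value = local_linear(0.35, x, y, h=0.5)
h_best = loo_cv_bandwidth(x, y, [0.1, 0.3, 1.0])
```

A small `h` makes the fit depend only on the nearest samples, while a large `h` approaches a single global fit; a locally adaptive variant would let `h` vary with `x0`.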

A final class of techniques discussed was Fourier-based array interpolation. 
The advantages of this approach are that no information regarding the array 
response is necessary, and that the Fourier-based array model allows computationally
efficient DOA estimation. The drawback is that a relatively dense
calibration grid is necessary to yield satisfactory results, to allow estimation of 
a sufficient number of Fourier coefficients (or phase modes). 

The different calibration techniques were presented in their basic form, 
involving a single-direction parameter only and a scalar “wavefield.” If a model 
can be specified, the extension of auto-calibration and parametric calibration 
techniques to multiple parameters per source and/or polarization-sensitive
antennas can be conceptually simple. However, practical applicability is limited
by the increasing number of parameters. Also, interpolation of a correction
factor is straightforward in principle, but computational complexity significantly 
increases. 

It should be noted that one of the most investigated applications of LPA
is image smoothing (Katkovnik et al., 2006), commonly referred to as moving
least squares in the image-processing community (Bose and Ahuja, 2006). Image
smoothing is a two-dimensional (2D) problem, corresponding to, for example, 
calibration over azimuth and elevation or azimuth and frequency. The increased 
complexity makes the search for computationally more efficient alternatives of 
great interest. An interesting alternative is manifold separation, if the implementation
can be based on 2D FFT. The extension of the presented methods to
a diversely polarized array, where the two polarization components (e.g., horizontal
and vertical) can be measured separately, is straightforward, since the
components act as separate arrays. Although it is more complicated to apply the 
techniques to circularly polarized antenna elements receiving signals of different 
polarizations, this is an interesting extension. 

Besides those just mentioned, many other questions related to array calibration
deserve further study. One interesting direction is to combine parametric
and nonparametric techniques. A simple, but not necessarily correct, parametric 
model can capture most of the variability in the array model. What is left can then 
be corrected for using nonparametric interpolation. It is an open issue whether
or not this would improve practical performance.

Regarding local polynomial approximation, one could question if choices
of basis function other than polynomials can give better performance. In fact,
using Fourier-based vectors would create a link between this approach and the
Fourier-based array interpolation in Section 3.5.1. This combination could lead
to a local Fourier-based model with potentially better performance, although at 
the expense of greater computational cost. 

Finally, there is a need to develop tools for predicting performance of the 
different approaches, in terms of a statistical analysis for a given (known) model 
and methods for predicting performance using available data only. Robust testing
based on the bootstrap (Zoubir and Boashash, 1998), which is closely related to
cross-validation, is a natural choice for this task.


Narrowband and Wideband DOA Estimation for Uniform and Nonuniform Linear Arrays


T. Engin Tuncer, T. Kaya Yasar, Benjamin Friedlander 


4.1 INTRODUCTION 


Direction-of-arrival (DOA) estimation for narrowband signals is well known 
in the literature. Recently, interest in nonuniform linear arrays has grown since 
they require fewer sensor and receiver structures and their performance can be 
increased via techniques such as array mapping. It is known that the DOA performance
of nonuniform linear arrays can be improved when the data for the missing
sensors is compensated (Tuncer et al., 2007a). Array mapping is an effective technique
for this purpose, since it can improve both computational efficiency and
performance. In this chapter, different array-mapping techniques are considered 
and examples of DOA performance are outlined for the narrowband case. 

Wideband DOA estimation has been explored for many years. Some signals
are inherently wideband, so significant processing gain can be obtained through
wideband processing. In wideband processing, unlike narrowband processing, several
frequency bins carry different information regarding source DOA angles. This 
generates diversity and processing gain for DOA estimation. In this chapter, both 
well-known and new approaches are considered for wideband DOA estimation. 

The main problem in wideband DOA estimation is how the information at 
each frequency can be combined to obtain the most accurate DOA information. 
Two main types of wideband processing are coherent and noncoherent processing.
In coherent wideband processing, the covariance matrix at each frequency
is combined coherently using a transformation and the final covariance matrix is 
used for DOA estimation. In noncoherent wideband processing, the DOA at each 
frequency is estimated separately and the results are combined noncoherently. 



While several coherent wideband processing methods are discussed in
the literature, the main philosophy behind them is the same: They all use a
transformation matrix that maps the covariance matrices at different frequencies
to a covariance matrix at the center frequency. The main differences among
these methods involve the transformation matrix. The aforementioned philosophy
for wideband processing was established in Hung and Kaveh (1988) and
Wang and Kaveh (1985, 1987) and later named the coherent signal-subspace
method (CSM). The transformation matrix is desired to be unitary in order to
avoid focusing loss (Hung and Kaveh, 1988), that is, the signal-to-noise ratio (SNR)
loss for the transformed system. The mapping matrices in Hung and Kaveh
(1988) belong to the family of rotational signal subspaces (RSSs), and they do
not have focusing loss.
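
The common idea behind these coherent methods can be written compactly. The notation below (J frequency bins $f_j$, focusing matrices $T(f_j)$, center frequency $f_0$, and per-bin covariance matrices $R(f_j)$) is assumed here for illustration and is not taken from the cited papers:

```latex
\begin{align}
  T(f_j)\,A(f_j,\theta) &\approx A(f_0,\theta), \qquad j = 1,\ldots,J, \\
  R_{\mathrm{coh}} &= \sum_{j=1}^{J} T(f_j)\,R(f_j)\,T^H(f_j).
\end{align}
```

If each $T(f_j)$ is unitary, spatially white noise with covariance $\sigma^2 I$ is mapped to $T(f_j)\,\sigma^2 I\,T^H(f_j) = \sigma^2 I$, so the noise stays white and its power is unchanged. This is one way to see why a unitary transformation is desirable.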

The performance of the transformation matrix designed by the RSS technique
is very good as long as the angular sector for directions of arrival (DOAs) is small.
The method has significant bias as the angular sector for the array mapping is
increased. Another limitation of RSS is that it is configured for the design of
square transformation matrices where the number of sensors or frequency bins
for both the domain and range spaces is the same. In this chapter, we show that
this limitation can be easily circumvented, allowing RSS to be used for the design
of rectangular transformation matrices. This corresponds to generating different
numbers of sensors or frequency bins for real and virtual arrays. The design of
the square transformation matrix is further developed in Krolik and Swingler
(1989, 1990) by considering, respectively, the steered covariance matrix and
resampling in the spatial domain.

In Doron and Weiss (1992), a general class of transformation, namely signal
subspace transformation (SST), is discussed and several properties are outlined.
More specifically, the set of transformation matrices for SST covers nonunitary
matrices that do not lead to focusing loss. A short review of these methods
can be found in Sellone (2006). Another approach for array mapping is array
[...] Weiss (1993), several interesting characteristics of coherent wideband
processing using array interpolation are outlined. Weighted average of the signal
subspace (WAVES) (Elio et al., 2001) is proposed to deal with various error sources. A
Bayesian approach for the joint estimation of model order and DOA is presented
in Ng et al. (2005).

Recently, the Wiener array interpolation technique, where the source and
noise powers are estimated using a maximum [...], was [...]
noise power estimates makes the Wiener formulation a practically attractive
approach. It was shown that its power estimation is accurate and that its performance
is better than that of classical array interpolation, especially at low SNR
(Kaya Yasar and Tuncer, 2008). Another advantage is its improvement of the
condition number of the inverse matrix for the array interpolation matrix. It
also allows more sensors in the virtual array than in the real array. This feature
of Wiener array interpolation is especially useful when fast algorithms like
Root-MUSIC are used for nonuniform linear arrays.

Noncoherent wideband DOA estimation methods were developed before
coherent techniques (Wax et al., 1982, 1984). While noncoherent wideband processing
is a simple extension of narrowband processing, it is effective for DOA
estimation. Its main problem is the computational complexity, since an
eigendecomposition is needed for each frequency bin. This problem is worsened by the
fact that noncoherent techniques require a search over a dense grid in order to
avoid local minima. Computational complexity increases linearly as the number
of frequency bins increases. Test of orthogonality of projected subspaces (TOPS) is
presented in Yoon et al. (2006) as a way to achieve good performance at medium
SNR levels. It is essentially a noncoherent method.

In this chapter, we focus on different array-mapping techniques for coherent 
wideband DOA estimation. Narrowband and wideband techniques are compared, 
and the advantages of both are outlined. One of the main problems in DOA estimation
is the correlation of source signals. When the source signals are correlated,
the algorithms’ DOA performance degrades. The worst case is obtained when
the source signals are coherent. For coherent sources, the CLEAN algorithm is
adopted to improve performance. CLEAN is well known in astronomy (Clark,
1980) and radar signal processing (Deng, 2004) for obtaining clean images from
those corrupted by multiple sources.

The advantages of the CLEAN algorithm are not well known in DOA estimation
except for some limited research in beamforming (Stoica and Moses, 2005).
The main reason is that CLEAN produces only marginal improvement for
well-separated uncorrelated multiple sources. For correlated sources, its advantage
becomes more obvious. The difference is more significant for coherent sources. In
this chapter, CLEAN is adopted for both narrowband and wideband processing.
Several simulations are done to show its performance in a variety of scenarios.


4.2 ARRAY MODELS 

Linear array geometry can be constructed by using uniformly or nonuniformly 
spaced array elements. Nonuniform linear sensor arrays have variants depending 
on their covariance matrix lags. The following section describes the details of 
these arrays. 

4.2.1 Uniform Linear Arrays 

Sensors are placed uniformly on a straight line for a uniform linear array (ULA). 
The inter-element distance is usually less than a half-wavelength to avoid spatial
ambiguity. Uniform linear arrays are one of the simplest forms of sensor arrays,
and they have several useful properties. For example, forward-backward spatial
smoothing (Pillai and Kwon, 1989) can only be applied to ULAs because of the
Vandermonde structure of the array steering matrix. Furthermore, fast subspace
algorithms like Root-MUSIC can be used with a ULA, which increases computational
efficiency. Usually it is desirable to increase the array aperture in order to
improve resolution. This can be achieved by increasing the number of sensors
when the inter-element distance is kept fixed for the ULA. However, adding
elements is expensive and it increases system as well as computational complexity.
One remedy for this problem is to keep the number of sensors the same
and increase the distance between them. Thus, we obtain nonuniform linear
arrays.


4.2.2 Nonuniform Linear Arrays 

In nonuniform linear arrays (NLAs), sensors are usually placed at integer multiples
of a unit distance $d < \lambda/2$, where $\lambda$ is the wavelength. Assuming that
the first sensor is the reference and is at the origin, the sensor displacements are
$d_{NLA} = [0 \;\; d_2 \;\; \ldots \;\; d_M]$ for an NLA with M sensors. The treatment of the noninteger
case can be found in Abramovich et al. (2000).

Since several sensors are placed with inter-element distances larger than a 
half-wavelength, there is a possibility of spatial ambiguity, where two different 
DOA angles generate the same covariance matrix, making it impossible to identify
the true source DOA. The possibility of ambiguous DOA measurement is
usually low as long as there is at least one pair of sensors with an inter-element 
distance of less than a half-wavelength. 

Nonuniform linear arrays try to cover a large array aperture with a limited 
number of sensors. Therefore, efficient use of sensors is the priority. Since NLAs 
cover an aperture with fewer sensors, their performance is worse than that of 
ULAs with the same aperture. However, their performance is much better than 
ULAs with the same number of sensors. The target in DOA estimation with 
an NLA is to obtain the performance and advantages of the equivalent-aperture 
ULA. Root-MUSIC can still be applied, but its performance is usually worse than 
Spectral-MUSIC because missing sensors lead to large errors in polynomial root 
finding. 

The co-array is used to characterize array structures. It can be defined as a symmetric
function that represents the number of times each spatial correlation lag
is contained for a given array structure. If we define h as the vector or sequence
where the existence of a sensor at a certain position is denoted as one or zero,
the co-array is obtained as the result of the following convolution operation:

$$c[n] = h[n] * h[-n] \tag{4.1}$$

For example, consider a 3-element NLA where $d_{NLA} = [0 \;\; 1 \;\; 3]$; h is given as
$h = [1 \;\; 1 \;\; 0 \;\; 1]$, and $c = [1 \;\; 1 \;\; 1 \;\; 3 \;\; 1 \;\; 1 \;\; 1]$. It does not have redundancy for the
positive and negative covariance lags and has three zero lags. Arrays that have
no redundancy and no missing covariance lags are called perfect arrays.
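
The convolution in (4.1) is easy to check numerically. The following is a minimal sketch; the function name `coarray` and the convention that positions are given in integer units of d are choices made for this example.

```python
import numpy as np

def coarray(positions):
    """Co-array c[n] = h[n] * h[-n] for integer sensor positions (units of d)."""
    positions = np.asarray(positions, dtype=int)
    h = np.zeros(positions.max() + 1, dtype=int)
    h[positions] = 1                      # 1 where a sensor exists, 0 otherwise
    return np.convolve(h, h[::-1])        # lags -(L-1), ..., 0, ..., (L-1)

print(coarray([0, 1, 3]))   # [1 1 1 3 1 1 1]
```

The center value equals the number of sensors, and every other entry counts the sensor pairs that share the corresponding lag.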

There are different types of NLA, some of which are described next. 


Nonredundant NLAs 

Nonredundant arrays have co-arrays that are either zero or one except at the
origin. In other words, at most one sensor pair contributes to any covariance lag. Certain
covariance lags may not be present for these arrays and co-arrays may contain
zeros. Table 4.1 presents a short list of nonredundant arrays. Those with $M \le 4$
are called perfect arrays with full co-arrays. A more complete list can be found
in Abramovich et al. (1999).


Nonredundant arrays use sensors very effectively, covering a large aperture. 
However, while this increases array resolution, it comes with a price. The array 
covariance matrix does not carry the same information as the equivalent ULA 
with the same aperture. Therefore, DOA performance of nonredundant arrays is 
worse than that of the same-aperture ULA. The missing sensors cause problems 
especially when the source signals are correlated or coherent. 


Minimum Redundant NLAs 

Minimum redundant arrays have no zeros or holes in their co-arrays. They have
all the covariance lags, some of which may repeat. Therefore, a covariance lag
may be represented by the correlation between more than one pair of sensors.
Minimum redundant arrays have the largest aperture among arrays with no gaps
in their co-arrays. Table 4.2 presents a short list of minimum redundant arrays.
A more complete list can be found in Linebarger et al. (1993) and Van Trees
(2002).


TABLE 4.1 Nonredundant Arrays

M     Sensor Separation
2     1
3     1 2
4     1 3 2
5     1 3 5 2
6     1 3 6 2 5
7     1 3 6 8 5 2
8     1 3 5 6 7 10 2
9     1 4 7 13 2 8 6 3
10    1 5 4 13 3 8 7 12 2


TABLE 4.2 Minimum Redundant Arrays

M     Sensor Separation
2     1
3     1 2
4     1 3 2
5     1 3 3 2, 3 4 1 1
6     1 3 1 6 2, 1 5 3 2 2, 1 1 4 4 3
7     1 3 6 2 3 2, 1 1 4 4 4 3, 1 1 1 5 5 4, 1 1 6 4 2 3, 1 7 3 2 2 2
8     1 3 6 6 2 3 2, 1 1 9 4 3 3 2



The covariance matrix of a minimum redundant array can be completed to
make it similar to that of a ULA with the same aperture. Since all covariance
lags are present, the covariance matrix can be completed effectively. There are
different approaches for this purpose (Abramovich et al., 1998; Pillai and Haber,
1987). It turns out that it is possible to improve the DOA accuracy of minimum
redundant arrays by covariance matrix completion.


Partly Filled NLAs 

Partly filled NLAs have a uniform linear array portion in their array structure,
making them a combination of ULAs and NLAs. This gives them certain advantages,
especially for coherent sources (Tuncer et al., 2007a). Forward-backward
spatial smoothing (Pillai and Kwon, 1989) can be used for the ULA part to obtain
an initial estimate, which can be improved by subsequent processing. An example
of a partly filled NLA is $d_{NLA} = [0 \;\; 1 \;\; 2 \;\; 3 \;\; 4 \;\; 8 \;\; 10 \;\; 11]$. The ULA part can
be at the beginning or end or in the middle.

A partly filled NLA has many repeating covariance lags as opposed to minimum
redundant or nonredundant arrays. This is an inefficient use of sensor
elements, but it is advantageous for obtaining more accurate covariance matrices.
In this respect, partly filled NLAs fill the gap between ULAs and NLAs.
Furthermore, they can be used for coherent sources effectively since an initial
DOA estimate can be obtained from the ULA part with forward-backward spatial
smoothing. Sometimes partly filled and minimum redundant NLAs overlap.
Examples of such arrays are presented in Table 4.3.


TABLE 4.3 Partly Filled and Minimum Redundant Arrays

M     Sensor Separation
7     1 1 1 5 5 4
12    1 1 1 (20) 5 4 4 4 4 3 3
13    1 1 1 (24) 5 4 4 4 4 4 3 3


Another classification divides arrays as fully augmentable and partially
augmentable. In fully augmentable arrays, all the covariance lags exist and
their co-arrays do not have missing lags. Minimum redundant arrays are fully
augmentable. Partially augmentable arrays have certain missing lags, and their
co-arrays have holes. Nonredundant arrays are partially augmentable if the
number of sensors is larger than four.
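
These classifications can be checked directly from a list of sensor separations such as those in the tables. The sketch below is illustrative only; the function name `classify` and the returned dictionary keys are hypothetical choices for this example.

```python
import numpy as np

def classify(separations):
    """Classify an NLA from its inter-sensor separations (in units of d)."""
    pos = np.concatenate(([0], np.cumsum(separations)))
    lags = np.abs(pos[:, None] - pos[None, :])            # all pairwise lags
    pair_lags = lags[np.triu_indices(pos.size, k=1)]      # each pair once
    aperture = int(pos[-1])
    counts = np.bincount(pair_lags, minlength=aperture + 1)[1:]  # lags 1..aperture
    return {
        "aperture": aperture,
        "fully_augmentable": bool(np.all(counts > 0)),    # no holes in co-array
        "nonredundant": bool(np.all(counts <= 1)),        # no repeated lags
    }

print(classify([1, 3, 2]))      # perfect array: no holes and no repeated lags
print(classify([1, 3, 5, 2]))   # nonredundant, M = 5: lag 6 is missing
print(classify([1, 3, 3, 2]))   # minimum redundant, M = 5: lag 3 repeats
```

An array that is both fully augmentable and nonredundant is a perfect array in the sense used above.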


4.3 NARROWBAND DIRECTION-OF-ARRIVAL 
ESTIMATION 

A narrowband model for array signal processing is the most frequently used 
framework because of its relative computational efficiency and simplicity. The 
following section describes this model and the DOA estimation techniques.


4.3.1 Narrowband Signal Model 

In the narrowband model, it is assumed that the time-bandwidth product is small:

$$B\tau \ll 1 \tag{4.2}$$

where B is the signal bandwidth and $\tau$ is the time the wave takes to propagate
across the array. There are n plane waves (or sources) impinging on a sensor array
of M elements. The sensor array output, $y_r(t)$, is an
$M \times 1$ complex vector given as

$$y_r(t) = A_r(\theta)\,s(t) + v(t) \tag{4.3}$$

where the signal, s(t), and the noise, v(t), are uncorrelated. The following
assumptions are required for subspace algorithms like MUSIC (Stoica and
Nehorai, 1998).


Assumption 1. The number of sensors is greater than the number of sources
(i.e., $M > n$), and the steering vectors in $A_r(\theta)$ are linearly independent.

Assumption 2. Noise is zero mean and is both spatially and temporally white
(i.e., $E\{v(t)\} = 0$ and $E\{v(t)v^H(t)\} = \sigma^2 I$).

Assumption 3. The number of snapshots is greater than the number of sensors,
$N > M$, and source signals are zero mean with a positive semi-definite
covariance matrix, $R_s = E\{s(t)s^H(t)\} \ge 0$.

In addition, sources are assumed to generate far-field planar waves in an isotropic
and nondispersive medium. The sensors are identical and perfectly calibrated, with
no gain and phase mismatch as well as no mutual coupling.
s(t) is the vector of source signals:

$$s(t) = [s_1(t), \ldots, s_n(t)]^T \tag{4.4}$$

It is assumed that $A_r(\theta)$ is the $M \times n$ steering matrix of the real array, which
here corresponds to the physical array. When the virtual array is obtained by
array interpolation, the steering matrix is denoted as $A_v(\theta)$. $A_r(\theta)$ is composed
of steering vectors for each source:

$$A_r(\theta) = [a(\theta_1) \;\; \ldots \;\; a(\theta_n)] \tag{4.5}$$


The steering vectors are given as

$$a(\theta_i) = \left[\, 1 \;\;\; \exp\!\left(j\frac{2\pi}{\lambda} d_2 \cos\theta_i\right) \;\;\; \cdots \;\;\; \exp\!\left(j\frac{2\pi}{\lambda} d_M \cos\theta_i\right) \right]^T \tag{4.6}$$

There are N snapshots or observations for $t = 1, 2, \ldots, N$. The covariance matrix
for the array output is

$$E\{y_r(t)\,y_r^H(t)\} = R = A_r(\theta)\,R_s\,A_r^H(\theta) + \sigma^2 I \tag{4.7}$$

$$R_s \triangleq E\{s(t)\,s^H(t)\} \tag{4.8}$$

The sample covariance matrix is the maximum-likelihood estimate of the
theoretical covariance matrix and is given as

$$\hat{R} = \frac{1}{N}\sum_{t=1}^{N} y_r(t)\,y_r^H(t) \tag{4.9}$$

It is a Gram matrix and is always positive semi-definite (Horn and Johnson,
1985). In practice, it becomes positive definite when the contribution of noise is
considered.
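
The signal model and the sample covariance can be simulated in a few lines. The sketch below assumes unit-power, uncorrelated, circular complex Gaussian sources and white noise; the function names, the default noise variance, and the choice of expressing sensor positions in wavelengths are assumptions made for this example.

```python
import numpy as np

def steering_matrix(d, theta_deg, wavelength=1.0):
    """Columns a(theta_i) as in (4.6): exp(j*2*pi/lambda * d_m * cos(theta_i))."""
    d = np.asarray(d, dtype=float)[:, None]
    theta = np.deg2rad(np.asarray(theta_deg, dtype=float))[None, :]
    return np.exp(2j * np.pi / wavelength * d * np.cos(theta))

def simulate_snapshots(d, theta_deg, n_snap, noise_var=0.1, seed=0):
    """Draw Y = A s + v, one column per snapshot, as in (4.3)."""
    rng = np.random.default_rng(seed)
    A = steering_matrix(d, theta_deg)
    m, n = A.shape
    s = (rng.standard_normal((n, n_snap))
         + 1j * rng.standard_normal((n, n_snap))) / np.sqrt(2)
    v = np.sqrt(noise_var / 2) * (rng.standard_normal((m, n_snap))
                                  + 1j * rng.standard_normal((m, n_snap)))
    return A @ s + v

def sample_covariance(Y):
    """Sample covariance (4.9): average of y(t) y(t)^H over the N snapshots."""
    return Y @ Y.conj().T / Y.shape[1]

# Usage: 3-element NLA with positions [0 1 3] in units of half a wavelength.
d = 0.5 * np.array([0, 1, 3])
Y = simulate_snapshots(d, [60.0, 100.0], n_snap=20000, noise_var=0.1, seed=1)
R_hat = sample_covariance(Y)
```

With many snapshots, `R_hat` approaches the theoretical covariance in (4.7), here with $R_s = I$ and $\sigma^2 = 0.1$.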


4.3.2 Array Mapping for Narrowband Signals 

Array mapping is an effective technique for both narrowband and wideband DOA 
estimation. It allows us to map an NLA to a virtual ULA with the same aperture, 
and it can be used to cohere covariance matrices for different frequencies. One 
strategy for DOA estimation is to first obtain an initial DOA estimate and then 
improve it iteratively. This strategy is sector independent and improves the accuracy
of array mapping. Mapping accuracy, as well as the final DOA estimate,
depends on the initial DOA estimate. 

For coherent source signals the initial DOA estimation is an important
problem for NLA. Different methods can be used to solve it. In this chapter,
Toeplitz completion, the forward-backward approach, and forward-backward
spatial smoothing are considered. It is also possible to use some special NLA
structures to deal with multipath signals. The use of partly filled NLAs is promising
since forward-backward spatial smoothing can be employed for their
ULA part. Once an initial estimate is available, iterations improve DOA accuracy.
Iterations do not monotonically decrease the error function, but only a few
iterations can lead to significant gains.

Different array-mapping techniques exist in the literature. One example,
Davies’s transformation (Davies, 1965), can be applied in a variety of cases,
but it is sensitive to the array geometry and selection of parameters. Array interpolation
(Bronez, 1988) is one of the most effective methods of array mapping.
It is similar to simple sample interpolation in digital signal processing. The main
difference between the two is the fact that array interpolation is model based. In
fact, this is the power behind it.

In this section, classical array interpolation, Wiener array interpolation, 
and the coherent signal-subspace method with the rotational signal-subspace 
technique are considered. Furthermore, the initial DOA estimation for multipath
signals is investigated using the forward-backward approach, Toeplitz
completion, and forward-backward spatial smoothing. 


Classical Array Interpolation 


Array interpolation was introduced and developed in Bronez (1988), Doron and
Weiss (1992), Doron et al. (1993), Friedlander (1993), and Friedlander and Weiss
(1992). It allows us to obtain the sensor data in a virtual array to some accuracy
depending on the geometries of the virtual and the real sensor arrays. In classical
array interpolation, all sources of interest are assumed to be present within a
certain angular sector. That is, DOA angle $\theta$ belongs to a sector $[\theta_b, \theta_f]$, which
is uniformly divided into intervals of width $\delta\theta$, and its calibration angles are generated as
$\theta_i = \theta_b + i\,\delta\theta$, where $i = 0, 1, \ldots, (\theta_f - \theta_b)/\delta\theta$. The calibration steering
matrices for the real array, $A_r(\theta)$, and the virtual array, $A_v(\theta)$, are constructed
using these calibration angles, and the classical array interpolation matrix, T, is
obtained as

$$T = A_v(\theta)\,A_r^H(\theta)\left(A_r(\theta)\,A_r^H(\theta)\right)^{-1} \tag{4.10}$$
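
A minimal numerical sketch of (4.10) follows. The helper names are hypothetical, sensor positions are assumed to be given in wavelengths, and the matrix is computed with a least-squares solver rather than an explicit inverse; the two coincide when the calibration steering matrix of the real array has full row rank.

```python
import numpy as np

def steering(d, theta_deg):
    """Steering matrix for sensor positions d (in wavelengths), angles in degrees."""
    d = np.asarray(d, dtype=float)[:, None]
    theta = np.deg2rad(np.asarray(theta_deg, dtype=float))[None, :]
    return np.exp(2j * np.pi * d * np.cos(theta))

def cai_matrix(d_real, d_virtual, theta_b, theta_f, step):
    """Classical array interpolation matrix T over the sector [theta_b, theta_f]."""
    grid = np.arange(theta_b, theta_f + step / 2, step)    # calibration angles
    A_r, A_v = steering(d_real, grid), steering(d_virtual, grid)
    # Solve T A_r = A_v in the least-squares sense: A_r^T T^T = A_v^T.
    T = np.linalg.lstsq(A_r.T, A_v.T, rcond=None)[0].T
    return T, grid

# Usage: map the 3-element NLA [0 1 3] (half-wavelength units) to a 4-element ULA.
T, grid = cai_matrix([0.0, 0.5, 1.5], [0.0, 0.5, 1.0, 1.5], 0.0, 180.0, 1.0)
```

Rows of `T` that correspond to virtual sensors coinciding with real sensors reduce to unit vectors, so only the missing sensor is actually interpolated.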


Classical array interpolation (CAI) is an effective method, but it has certain limitations. The accuracy of the mapping matrix $\mathbf{T}$ increases as the interpolation sector decreases. However, the sector should be kept sufficiently large to cover a large region of interest. Any source outside the interpolation sector is not guaranteed to be identified correctly. The multiple sector approach in Friedlander (1993) tries to solve this problem. The bias and mean-square error (MSE) for the array interpolation can also be minimized to a certain extent (Hyberg et al., 2005). Another problem in classical array interpolation is the condition number of the matrix $\mathbf{A}_r(\theta)\mathbf{A}_r^H(\theta)$. This matrix may be ill-conditioned; certain techniques are used to improve its condition number (Friedlander, 1993). 

Wiener Array Interpolation 

Wiener array interpolation (WAI) was proposed in Tuncer et al. (2007a,b) as a way to overcome some of the problems of CAI. WAI is the approximation of the MSE optimum solution since the calibration angles are used to construct the array steering matrix. Its advantages include that it is more effective than CAI at low SNR, the condition number of the inverse matrix in the transformation matrix is significantly better, and it is possible to select the number of sensors in the virtual array to be larger than the number of sensors in the real array. Also, WAI performs better than CAI in wideband DOA estimation, especially when only a small number of frequency bins are available. The Wiener formulation in Tuncer et al. (2007a,b) assumes that the signal and noise powers are known, which is unrealistic for a practical application. Fortunately, it is possible to accurately estimate the signal and noise powers with a maximum likelihood approach. 

WAI can be carried out by considering the real array output, $\mathbf{y}_r(t)$, and the desired or virtual array output, $\mathbf{y}_v(t)$. It is assumed that there are $M_v$ sensors in the virtual array and that the array geometry is known. The error between these outputs is 

$$\mathbf{e} = \mathbf{y}_v(t) - \mathbf{T}\,\mathbf{y}_r(t) \tag{4.11}$$

The MSE optimum solution for the $M_v \times M$ matrix $\mathbf{T}$ is 

$$\mathbf{T} = \mathbf{A}_v \mathbf{R}_s \mathbf{A}_r^H \left(\mathbf{A}_r \mathbf{R}_s \mathbf{A}_r^H + \mathbf{R}_v\right)^{-1} \tag{4.12}$$

If we assume $\mathbf{R}_v = \sigma_v^2 \mathbf{I}$ and $\mathbf{R}_s = \sigma_s^2 \mathbf{I}$ for uncorrelated source signals, we have 

$$\mathbf{T} = \sigma_s^2 \mathbf{A}_v \mathbf{A}_r^H \left(\sigma_s^2 \mathbf{A}_r \mathbf{A}_r^H + \sigma_v^2 \mathbf{I}\right)^{-1} \tag{4.13}$$

The noise component in $\left(\sigma_s^2 \mathbf{A}_r \mathbf{A}_r^H + \sigma_v^2 \mathbf{I}\right)^{-1}$ improves the condition number and allows more sensors in the virtual array than in the real array. In order to have a sector-independent array interpolation, we need an initial estimate of the source angles. It is possible to obtain initial estimates by Toeplitz completion in the case of NLA. Assuming that a good initial estimate is available, DOA accuracy can be improved significantly by subsequent processing. 
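The effect of the noise term on conditioning can be checked numerically. The sketch below evaluates Equation (4.13) and compares the condition number of the matrix inverted there with that of $\mathbf{A}_r \mathbf{A}_r^H$ from Equation (4.10). The function names, the sector, the power values, and the 14-element virtual ULA are illustrative assumptions.

```python
import numpy as np

def steering(positions, angles_deg):
    """Steering matrix of a linear array; positions in half-wavelength units (assumed)."""
    pos = np.asarray(positions, dtype=float)[:, None]
    theta = np.deg2rad(np.asarray(angles_deg, dtype=float))[None, :]
    return np.exp(1j * np.pi * pos * np.cos(theta))

def wai_matrix(A_r, A_v, sig_pow, noise_pow):
    """Wiener array interpolation matrix of Equation (4.13)."""
    M = A_r.shape[0]
    loaded = sig_pow * (A_r @ A_r.conj().T) + noise_pow * np.eye(M)
    T = sig_pow * A_v @ A_r.conj().T @ np.linalg.inv(loaded)
    return T, loaded

calib = np.arange(60.0, 80.0 + 1e-9, 0.5)        # hypothetical calibration grid
A_r = steering([0, 1, 4, 5, 11, 13], calib)      # 6-sensor NLA
A_v = steering(np.arange(14), calib)             # 14-sensor virtual ULA, M_v > M
T, loaded = wai_matrix(A_r, A_v, sig_pow=1.0, noise_pow=0.1)

cond_cai = np.linalg.cond(A_r @ A_r.conj().T)    # matrix inverted in (4.10)
cond_wai = np.linalg.cond(loaded)                # matrix inverted in (4.13)
```

Because the added term shifts every eigenvalue by the same positive amount, `cond_wai` cannot exceed `cond_cai`, and `T` has more rows than columns without any change to the formula.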

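Once we have an initial DOA estimate, $\hat{\boldsymbol{\theta}} = [\hat{\theta}_1, \ldots, \hat{\theta}_n]$, we construct $\mathbf{T}$ by considering narrow sectors for each $\hat{\theta}_i$. These small sectors are divided with calibration angles $\theta_{ij} \in [\hat{\theta}_i - \theta_\epsilon, \hat{\theta}_i + \theta_\epsilon]$ and $\theta_{ij} = \hat{\theta}_i - \theta_\epsilon + j\,\delta\theta$, where $j = 0, 1, \ldots, 2\theta_\epsilon/\delta\theta$. $\mathbf{A}_v$ and $\mathbf{A}_r$ are constructed from the calibration angles $\theta_{ij}$. For simplicity, $\theta$ is dropped from the steering matrix expressions. 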

To apply the WAI as in Equation (4.13), we need to find the signal and noise powers. These parameters can be found using the stochastic maximum likelihood estimation. Noise variance is estimated by projecting the covariance matrix to a space orthogonal to the signal space. The orthogonal projection matrix, $\mathbf{P}_h$, is 

$$\mathbf{A}_h = \left(\mathbf{A}_r^H \mathbf{A}_r\right)^{-1} \mathbf{A}_r^H \tag{4.14}$$

$$\mathbf{P}_h = \mathbf{I} - \mathbf{A}_r \mathbf{A}_h \tag{4.15}$$

Then the noise variance is found by projecting the covariance matrix as 

$$\hat{\sigma}_v^2 = \frac{1}{M - n}\,\mathrm{tr}\!\left(\mathbf{P}_h \hat{\mathbf{R}}\right) \tag{4.16}$$

Once the noise variance is found, the signal covariance matrix, $\mathbf{P}_{SML}$, can be obtained by first removing the noise contribution and then using the following expression: 

$$\mathbf{P}_{SML} = \mathbf{A}_h \left(\hat{\mathbf{R}} - \hat{\sigma}_v^2 \mathbf{I}\right) \mathbf{A}_h^H \tag{4.17}$$

In general, signal variance can be different for different sources. The source signal variances can be obtained as 

$$\mathrm{diag}(\sigma_1^2, \sigma_2^2, \ldots, \sigma_n^2) = \mathrm{diag}(\mathbf{P}_{SML}) \tag{4.18}$$

The SNR for each source can be obtained as 

$$\mathrm{SNR}_i = 10 \log_{10}\!\left(\sigma_i^2 / \sigma_v^2\right), \quad i = 1, \ldots, n \tag{4.19}$$
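Equations (4.14) to (4.19) translate almost line by line into code. The sketch below applies them to an exact (not sample) covariance matrix so that the recovered powers can be compared with the values used to build it. The function name `sml_powers`, the source angles, and the power values are assumptions for illustration.

```python
import numpy as np

def steering(positions, angles_deg):
    """Steering matrix of a linear array; positions in half-wavelength units (assumed)."""
    pos = np.asarray(positions, dtype=float)[:, None]
    theta = np.deg2rad(np.asarray(angles_deg, dtype=float))[None, :]
    return np.exp(1j * np.pi * pos * np.cos(theta))

def sml_powers(A_r, R):
    """Noise variance, source variances, and per-source SNR in dB, per (4.14)-(4.19)."""
    M, n = A_r.shape
    A_h = np.linalg.solve(A_r.conj().T @ A_r, A_r.conj().T)     # (4.14)
    P_h = np.eye(M) - A_r @ A_h                                 # (4.15)
    noise_var = np.real(np.trace(P_h @ R)) / (M - n)            # (4.16)
    P_sml = A_h @ (R - noise_var * np.eye(M)) @ A_h.conj().T    # (4.17)
    src_var = np.real(np.diag(P_sml))                           # (4.18)
    snr_db = 10.0 * np.log10(src_var / noise_var)               # (4.19)
    return noise_var, src_var, snr_db

# Exact covariance for two uncorrelated sources (illustrative values)
A = steering([0, 1, 4, 5, 11, 13], [70.0, 85.0])
R = A @ np.diag([2.0, 0.5]) @ A.conj().T + 0.1 * np.eye(6)
noise_var, src_var, snr_db = sml_powers(A, R)
```

With a sample covariance and steering vectors built from estimated angles, the same calls return estimates rather than exact values.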

WAI can be carried out in the following steps: 

Step 1. Given the sensor outputs, $\mathbf{y}_r$, obtain the sample covariance matrix $\hat{\mathbf{R}}$ from Equation (4.9). Use Toeplitz completion for NLA. Fill the missing covariance lags using spline interpolation if necessary, and find the complete Hermitian symmetric Toeplitz covariance matrix, $\mathbf{R}_{TC}$. 

Step 2. Use $\mathbf{R}_{TC}$ in the Root-MUSIC algorithm and find the initial DOA estimate $\hat{\boldsymbol{\theta}}$. 

Step 3. Given $\hat{\boldsymbol{\theta}}$, construct $\mathbf{T}$ from Equation (4.13). 

Step 4. Find $\mathbf{R}_T = \mathbf{T}\hat{\mathbf{R}}\mathbf{T}^H$ and use Root-MUSIC to find the refined DOA estimate. 

Step 5. Iterate steps 3 and 4 by updating the initial estimate. 

It turns out that iterations can significantly improve DOA accuracy, at the price of increased computational complexity. 
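Steps 2 and 4 both rely on Root-MUSIC, which this section does not spell out. A minimal sketch in the standard textbook form for a half-wavelength ULA is given below; it is not necessarily the implementation used for the experiments, and the simulated data (seed, source angles, powers) are assumptions for illustration.

```python
import numpy as np

def root_music(R, n_src):
    """Root-MUSIC for a half-wavelength ULA covariance R; returns DOAs in degrees."""
    M = R.shape[0]
    _, vecs = np.linalg.eigh(R)                  # eigenvalues in ascending order
    G = vecs[:, :M - n_src]                      # noise subspace
    C = G @ G.conj().T
    # On |z| = 1, a^H(z) C a(z) = sum_k c_k z^k, with c_k the sum of the k-th diagonal of C
    coeffs = [np.trace(C, offset=k) for k in range(M - 1, -M, -1)]
    roots = np.roots(coeffs)
    inside = roots[np.abs(roots) < 1.0]
    chosen = inside[np.argsort(1.0 - np.abs(inside))[:n_src]]   # closest to the unit circle
    return np.sort(np.rad2deg(np.arccos(np.angle(chosen) / np.pi)))

# Simulated snapshots: 10-element ULA, two uncorrelated sources, SNR = 10 dB (assumed setup)
rng = np.random.default_rng(0)
M, N, true_doas = 10, 256, np.array([70.0, 85.0])
A = np.exp(1j * np.pi * np.arange(M)[:, None] * np.cos(np.deg2rad(true_doas))[None, :])
S = (rng.standard_normal((2, N)) + 1j * rng.standard_normal((2, N))) / np.sqrt(2)
V = np.sqrt(0.1 / 2) * (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N)))
Y = A @ S + V
R_hat = Y @ Y.conj().T / N
doa_est = root_music(R_hat, 2)
```

In the WAI loop, `R_hat` would be replaced by $\mathbf{R}_{TC}$ in step 2 and by $\mathbf{R}_T$ in step 4.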


Rotational Signal Subspace Method 

The coherent signal-subspace method (CSM) was proposed for wideband DOA estimation (Wang and Kaveh, 1985). Its application requires the design of a transformation matrix, and the RSS technique is presented in Hung and Kaveh (1988) for this purpose. The main idea is to keep the same SNR after the application of the transformation matrix. This is also called focusing design, and the resulting mapping is the focusing matrix. The focusing matrix can easily be obtained with the desired property if the mapping matrix is constrained to be unitary. The advantages of such a design are twofold. As already mentioned, there is no SNR loss after the transformation; nor is the noise covariance matrix changed. One important assumption for subspace techniques such as MUSIC is that noise should be white or at least approximately so. Unitary transformation does not change the noise statistics, and MUSIC is as effective as with the original data. In RSS, the transformation matrix is designed as the solution to 

$$\mathbf{T} = \arg\min_{\mathbf{T}} \left\| \mathbf{T}\mathbf{A}_r - \mathbf{A}_v \right\|_F^2 \quad \text{s.t.} \quad \mathbf{T}^H \mathbf{T} = \mathbf{I}_M \tag{4.20}$$

where the $M \times K$ matrix $\mathbf{A}_r$ stands for the real array steering matrix and the $M_v \times K$ matrix $\mathbf{A}_v$ is the array steering matrix for the virtual array. It is assumed that $K$ calibration angles are used in the neighborhood of the initial DOA estimates and that $M_v = M$. 


The solution for the $M \times M$ matrix $\mathbf{T}$ is given in Hung and Kaveh (1988) as 

$$\mathbf{T} = \mathbf{V}\mathbf{U}^H \tag{4.21}$$

where the $M \times M$ unitary matrices $\mathbf{V}$ and $\mathbf{U}$ are obtained from the singular value decomposition, 

$$\mathbf{A}_r \mathbf{A}_v^H = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^H \tag{4.22}$$

When the number of sources is less than the number of sensors, the solution is not unique. However, this is not an important problem and the performance of the transformation matrix is good. 
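Equations (4.21) and (4.22) amount to one SVD call. The sketch below builds the focusing matrix for a small calibration sector and checks the two properties discussed above: unitarity, and a fit that is no worse than that of any other unitary matrix (here the identity). The helper names, the sector, and the array geometries are assumptions for illustration.

```python
import numpy as np

def steering(positions, angles_deg):
    """Steering matrix of a linear array; positions in half-wavelength units (assumed)."""
    pos = np.asarray(positions, dtype=float)[:, None]
    theta = np.deg2rad(np.asarray(angles_deg, dtype=float))[None, :]
    return np.exp(1j * np.pi * pos * np.cos(theta))

def rss_matrix(A_r, A_v):
    """Unitary focusing matrix T = V U^H of Equations (4.21)-(4.22)."""
    U, _, Vh = np.linalg.svd(A_r @ A_v.conj().T)   # A_r A_v^H = U S V^H
    return Vh.conj().T @ U.conj().T

calib = np.arange(65.0, 75.0 + 1e-9, 0.5)          # K = 21 angles near an initial estimate of 70 degrees
A_r = steering([0, 1, 4, 5, 11, 13], calib)        # real NLA
A_v = steering(np.arange(6), calib)                # virtual ULA with M_v = M = 6
T = rss_matrix(A_r, A_v)

fit_rss = np.linalg.norm(T @ A_r - A_v)            # Frobenius norm of the residual in (4.20)
fit_identity = np.linalg.norm(A_r - A_v)           # T = I is also a feasible (unitary) choice
```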


Generalized RSS Method 


In Hung and Kaveh (1988), the RSS method is presented for the case where the number of sensors in real and virtual arrays is equal. It is possible to generalize the RSS method (generalized RSS, GRSS) for the case where the number of sensors in a virtual array is larger than the number of real sensors (i.e., $M_v > M$). This can be easily achieved with a small modification of the transformation matrix $\mathbf{T}$. In this case, the solution for the $M_v \times M$ matrix $\mathbf{T}$ is 

$$\mathbf{T} = \mathbf{V}\tilde{\boldsymbol{\Sigma}}\mathbf{U}^H \tag{4.23}$$

where the $M_v \times M$ matrix $\tilde{\boldsymbol{\Sigma}} = [\mathbf{I}_{M \times M} \;\; \mathbf{0}]^T$, and the $M_v \times M_v$ and $M \times M$ unitary matrices $\mathbf{V}$ and $\mathbf{U}$ are obtained from the singular value decomposition, 

$$\mathbf{A}_r \mathbf{A}_v^H = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^H \tag{4.24}$$

Note that the formulation in Equation (4.23) can be used to map arrays with $M_v > M$ as well as $M_v = M$. $\mathbf{T}$ satisfies all the desired properties of the focusing matrices (Hung and Kaveh, 1988). 
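A short sketch of Equation (4.23) follows. It relies on the full SVD so that $\mathbf{V}$ is $M_v \times M_v$, and it checks that $\mathbf{T}^H\mathbf{T} = \mathbf{I}_M$ still holds for a tall $\mathbf{T}$. The helper names and the 14-element virtual ULA are assumptions for illustration.

```python
import numpy as np

def steering(positions, angles_deg):
    """Steering matrix of a linear array; positions in half-wavelength units (assumed)."""
    pos = np.asarray(positions, dtype=float)[:, None]
    theta = np.deg2rad(np.asarray(angles_deg, dtype=float))[None, :]
    return np.exp(1j * np.pi * pos * np.cos(theta))

def grss_matrix(A_r, A_v):
    """Generalized RSS matrix T = V Sigma_tilde U^H of Equation (4.23)."""
    M, M_v = A_r.shape[0], A_v.shape[0]
    # Full SVD of the M x M_v matrix: U is M x M, Vh is M_v x M_v
    U, _, Vh = np.linalg.svd(A_r @ A_v.conj().T, full_matrices=True)
    sigma_tilde = np.eye(M_v, M)                   # [I_M 0]^T
    return Vh.conj().T @ sigma_tilde @ U.conj().T

calib = np.arange(65.0, 75.0 + 1e-9, 0.5)          # hypothetical calibration grid
A_r = steering([0, 1, 4, 5, 11, 13], calib)        # M = 6 real sensors
A_v = steering(np.arange(14), calib)               # M_v = 14 virtual sensors
T = grss_matrix(A_r, A_v)
```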

4.3.3 Initial DOA Estimation for Coherent Sources 

Source signals are coherent in the case of multipath reflections. The DOA estimation becomes problematic for coherent sources, and preprocessing is usually needed. When the source signals are coherent, the covariance matrix becomes rank deficient and it is not possible to find the source DOA. Certain methods can be used at the preprocessing stage to improve the rank, but their performance and computational complexity vary. These methods can be applied to uniform linear arrays; however, array mapping can be employed for NLA to obtain an equivalent virtual ULA to take advantage of them. In this section, forward-backward, Toeplitz completion, and forward-backward spatial smoothing techniques are considered. 


Forward-Backward Approach 

The forward-backward approach (FBA) is a simple and efficient technique used in a variety of applications, including linear prediction. The modified covariance matrix is obtained from the sample covariance matrix as 

$$\tilde{\mathbf{R}} = \frac{1}{2}\left(\hat{\mathbf{R}} + \mathbf{J}\hat{\mathbf{R}}^T\mathbf{J}\right) \tag{4.25}$$

where $\mathbf{J}$ is an off-diagonal matrix, 

$$\mathbf{J} = \begin{bmatrix} 0 & \cdots & 0 & 1 \\ 0 & \cdots & 1 & 0 \\ \vdots & & & \vdots \\ 1 & 0 & \cdots & 0 \end{bmatrix} \tag{4.26}$$

$\tilde{\mathbf{R}}$ is a persymmetric matrix like the theoretical covariance matrix, which is also Toeplitz. 

FBA performance is good for coherent sources, but DOA-dependent errors are observed in general. Furthermore, this technique is usually applicable when the number of sources is less than three, $n < 3$. With several sources, FBA is not effective. 
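The rank-restoring effect of Equation (4.25) can be seen on a noiseless example. The sketch below forms the rank-one covariance of two fully coherent paths on a 10-element ULA and applies the forward-backward average. The function name and the chosen angles and amplitudes are assumptions for illustration.

```python
import numpy as np

def forward_backward(R):
    """Forward-backward averaged covariance of Equation (4.25)."""
    M = R.shape[0]
    J = np.fliplr(np.eye(M))             # exchange matrix of Equation (4.26)
    return 0.5 * (R + J @ R.T @ J)

M = 10
phi = np.pi * np.cos(np.deg2rad([70.0, 85.0]))
A = np.exp(1j * np.arange(M)[:, None] * phi[None, :])   # half-wavelength ULA
b = A @ np.array([1.0, 1.0])             # two coherent, equal-amplitude paths
R_coh = np.outer(b, b.conj())            # noiseless covariance, rank 1
R_fb = forward_backward(R_coh)

rank_before = np.linalg.matrix_rank(R_coh, tol=1e-8)
rank_after = np.linalg.matrix_rank(R_fb, tol=1e-8)
J = np.fliplr(np.eye(M))
```

The second eigenvalue of `R_fb` depends on the phase difference between the two paths across the aperture, which is one way to read the DOA-dependent behavior noted above.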


Toeplitz Completion 

Toeplitz completion (TC) is a well-known method for improving the covariance matrix of NLA. In Pillai and Haber (1987), it is shown that TC generates an $M_v \times M_v$ covariance matrix for an $M < M_v$-element minimum redundant array, which is identical to the covariance matrix of a ULA of $M_v$ elements when true statistics are available. If there is only a sample covariance matrix, TC generates a matrix that results in some degradations in DOA performance (Pillai and Haber, 1987). In Pillai and Haber (1987) the covariance matrix obtained by the TC is shown to be not necessarily positive semi-definite. This point is also expressed in Abramovich et al. (1998). In Abramovich et al. (1998), it is shown that the covariance matrix obtained by TC performs well for low SNR and can be used as a good initial estimate for high SNR for uncorrelated sources. TC performance for coherent sources is not considered in the literature. 

Toeplitz completion is a simple and computationally efficient way to deal with coherent sources, especially to obtain an initial DOA estimate. In the case of ULAs, it corresponds to averaging the elements along each diagonal of the covariance matrix. For NLAs, it requires augmentation for the missing elements. Consider an NLA with $M$ sensors. We want to obtain an $M_v \times M_v$ ($M_v > M$) covariance matrix, $\mathbf{R}_{TC}$, corresponding to the full virtual array with the same aperture as the NLA. Let $\mathbf{r}_{TC}$ be the first row of this Hermitian symmetric Toeplitz matrix. Once $\mathbf{r}_{TC}$ is known, $\mathbf{R}_{TC}$ can be easily constructed by Toeplitz completion. An $M_v \times M_v$ sparse sample covariance matrix, $\hat{\mathbf{R}}_{va}$, is obtained from the sensor outputs by assuming that the outputs of missing sensors are zero. Then the elements of $\mathbf{r}_{TC}$ are obtained: 

$$r_{TC}(k) = \frac{\sum_{i=1}^{M_v-(k-1)} \hat{R}_{va}(i,\, i+(k-1))}{c(M_v + k - 1)}, \quad k = 1, \ldots, M_v \tag{4.27}$$

assuming that the co-array $c(k)$ has all nonzero elements. If the co-array has zero lags, $r_{TC}(k) = 0$ for that lag. TC is more effective for minimum redundant arrays since they do not have any zero elements for the co-array. When $c(k)$ has missing covariance lags, interpolation techniques can be employed to fill them. 
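Equation (4.27) can be sketched as follows: the NLA covariance is scattered into a zero-filled $M_v \times M_v$ matrix, each superdiagonal is summed, and the sum is divided by the number of sensor pairs contributing to that lag. The function name is an assumption, sensor positions are taken as integer multiples of the unit spacing starting at 0, and the check uses an exact covariance for uncorrelated sources.

```python
import numpy as np

def toeplitz_completion(R, positions):
    """Toeplitz-completed covariance R_TC from an NLA covariance R, per Equation (4.27)."""
    pos = np.asarray(positions, dtype=int)
    M_v = pos.max() + 1
    R_va = np.zeros((M_v, M_v), dtype=complex)
    R_va[np.ix_(pos, pos)] = R                       # missing sensors contribute zeros
    present = np.zeros(M_v)
    present[pos] = 1.0
    r = np.zeros(M_v, dtype=complex)                 # first row of R_TC
    for lag in range(M_v):
        count = np.sum(present[:M_v - lag] * present[lag:])   # co-array weight for this lag
        if count > 0:
            r[lag] = np.trace(R_va, offset=lag) / count
    idx = np.arange(M_v)
    diff = idx[None, :] - idx[:, None]               # column index minus row index
    return np.where(diff >= 0, r[np.abs(diff)], np.conj(r[np.abs(diff)]))

def exact_cov(positions, angles_deg, powers, noise_var):
    """Exact covariance of uncorrelated sources on a linear array (half-wavelength units)."""
    pos = np.asarray(positions, dtype=float)[:, None]
    A = np.exp(1j * np.pi * pos * np.cos(np.deg2rad(angles_deg))[None, :])
    return A @ np.diag(powers) @ A.conj().T + noise_var * np.eye(len(positions))

nla = [0, 1, 4, 5, 11, 13]
R_nla = exact_cov(nla, np.array([70.0, 85.0]), [1.0, 0.5], 0.1)
R_tc = toeplitz_completion(R_nla, nla)
R_ula = exact_cov(np.arange(14), np.array([70.0, 85.0]), [1.0, 0.5], 0.1)
```

With true statistics the completed matrix coincides with the 14-element ULA covariance, in line with the result attributed to Pillai and Haber (1987) above.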


Forward-Backward Spatial Smoothing 

FBA and TC methods do not change the size of the covariance matrix. Therefore, there is no loss in array resolution. Forward-backward spatial smoothing (FBSS) uses the subarrays, and the resulting covariance matrix size is less than the original covariance matrix size. This may be a disadvantage, but FBSS is still the best method to deal with the coherent sources, especially when ULAs are considered. When NLAs are used, TC gives better estimates. Applying FBSS after TC makes no difference. 

FBSS (Pillai and Kwon, 1989) is more efficient than spatial smoothing (Shan et al., 1985) in the sense that it requires fewer sensors. The linear array is divided into subarrays, and the corresponding covariance matrices are summed to obtain a high-rank matrix. There are certain conditions on the number of array elements and subarray size. The number of sensors, $M$, and the number of sensors in the subarray, $L$, should satisfy 

$$M \ge \lfloor L + n/2 - 1 \rceil \tag{4.28}$$

where $\lfloor \cdot \rceil$ rounds to the nearest integer, and $n$ is the number of coherent sources. The subarray size, $L$, satisfies 

$$n < L \le \lfloor M - n/2 + 1 \rceil \tag{4.29}$$

The minimum value for $L$ is $L_{min} = n + 1$ and $M_{min} = \lfloor 3n/2 \rceil$. The FBSS covariance matrix, $\mathbf{R}_{fb}$, can be obtained from the sample covariance matrix, $\hat{\mathbf{R}}$, for ULAs or from the completed covariance matrix for NLAs. First, the forward-backward matrix $\tilde{\mathbf{R}}$ is obtained: 


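$$\tilde{\mathbf{R}} = \frac{1}{2}\left(\hat{\mathbf{R}} + \mathbf{J}\hat{\mathbf{R}}^T\mathbf{J}\right) \tag{4.30}$$

Then 

$$\mathbf{R}_{fb} = \frac{1}{P}\sum_{k=1}^{P} \mathbf{Z}_k \tilde{\mathbf{R}} \mathbf{Z}_k^T \tag{4.31}$$

where $P = M - L + 1$ is the number of subarrays and 

$$\mathbf{Z}_k = \left[\mathbf{0}_{L \times (k-1)} \;\; \mathbf{I}_{L \times L} \;\; \mathbf{0}_{L \times (M_v - (L + k - 1))}\right] \tag{4.32}$$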
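Since $\mathbf{Z}_k \tilde{\mathbf{R}} \mathbf{Z}_k^T$ simply extracts the $k$th $L \times L$ diagonal block, Equations (4.30) to (4.32) reduce to a short loop. The sketch below applies them to the noiseless covariance of three coherent sources on a 10-element ULA. The function name and the source parameters are assumptions for illustration.

```python
import numpy as np

def fbss(R, L):
    """Forward-backward spatially smoothed covariance, per Equations (4.30)-(4.32)."""
    M = R.shape[0]
    J = np.fliplr(np.eye(M))
    R_tilde = 0.5 * (R + J @ R.T @ J)                # (4.30)
    P = M - L + 1                                    # number of subarrays
    R_fb = np.zeros((L, L), dtype=complex)
    for k in range(P):
        R_fb += R_tilde[k:k + L, k:k + L]            # Z_k R_tilde Z_k^T selects this block
    return R_fb / P                                  # (4.31)

M, L = 10, 6
phi = np.pi * np.cos(np.deg2rad([40.0, 70.0, 95.0]))
A = np.exp(1j * np.arange(M)[:, None] * phi[None, :])            # half-wavelength ULA
b = A @ np.array([1.0, 0.8 * np.exp(0.5j), 0.6 * np.exp(-1.0j)]) # three coherent paths
R_coh = np.outer(b, b.conj())                        # noiseless covariance, rank 1
R_fbss = fbss(R_coh, L)

rank_before = np.linalg.matrix_rank(R_coh, tol=1e-8)
rank_after = np.linalg.matrix_rank(R_fbss, tol=1e-8)
```

The price is visible in the shapes: the smoothed matrix is $L \times L$ rather than $M \times M$.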


Comparison of Methods for Initial 
DOA Estimation 

Here, FBA, TC, and FBSS are compared for narrowband coherent sources in the case of ULA, minimum redundant NLA, and partly filled NLA, respectively. The number of snapshots is set to $N = 256$, and SNR = 10 dB. There are 500 trials for each experiment. The results with ULA are shown in Figure 4.1. The ULA has ten sensors. In Figure 4.1(a) there are two coherent sources, one fixed at 70°, the other swept between 10° and 170°. An experiment with three coherent sources is shown in Figure 4.1(b), where two sources at 70° and 95° are fixed and the third is swept. From Figure 4.1 it is seen that the FBA returns DOA-dependent estimates for two sources, where for certain DOA angles performance is unacceptable. FBA does not perform well for three coherent sources. TC has an acceptable performance for ULA in the case of a coherent source; it returns biased estimates but the accuracy is satisfactory for an initial estimate. FBSS has the best performance, and it approaches the CRB. 

FIGURE 4.1 Initial DOA estimation for coherent sources by TC, FBA, and FBSS for ULA. (a) One source is fixed at 70°; the second is swept. (b) Two sources are fixed at 70° and 95° and the third is swept. SNR = 10 dB. 

FIGURE 4.2 Initial DOA estimation for uncorrelated and coherent sources by TC, FBA, and TC-FBSS for NLA. One source is fixed at 70°; the second is swept. (a) Uncorrelated sources. (b) Coherent sources. SNR = 10 dB. 

The second experiment is with a minimum redundant NLA with sensor positions $d_{NLA} = [0\ 1\ 4\ 5\ 11\ 13]$. FBSS cannot be directly applied to obtain an initial DOA estimate. It turns out that applying it after TC (TC-FBSS) does not alter DOA estimation accuracy. Figures 4.2(a) and 4.2(b) show the DOA estimates for different methods for uncorrelated and coherent sources, respectively. There are two sources: one fixed at 70° and the other swept. Figure 4.2(a) shows that TC and TC-FBSS have similar performance and that TC performs better than FBA. When the sources are coherent, as in Figure 4.2(b), TC, TC-FBSS, and FBA do not perform well for certain DOA angles, which means that initial DOA estimation is a problem in NLA. 

One solution for initial NLA estimation is to use partly filled NLA and employ FBSS for the ULA part. Figure 4.3 shows the DOA estimate of $d_{NLA} = [0\ 1\ 2\ 3\ 8\ 13\ 17]$ when only the first four elements are used. FBSS is applied to the covariance matrix of four sensors and an initial DOA estimate is found. There are two coherent sources: one fixed at 70° and the other swept. As seen from Figure 4.3, partly filled NLA has an advantage compared to other types of NLA when the source signals are coherent. 

CLEAN Algorithm for Coherent Sources 

DOA estimation in multipath signals is challenging. The covariance matrix is rank deficient and does not reflect the multiple-source DOA angles. The CLEAN algorithm is well known in astronomy (Clark, 1980; Schwarz, 1978) and radar (Abramovich, 1979; Deng, 2004; Tsao and Steinberg, 1988), especially for reducing sidelobe artifacts. Different versions of CLEAN exist in the literature, but the main idea is the same: namely, to remove the strongest signals from the observed data successively. Some of these algorithms operate over the observed signals (Tsao and Steinberg, 1988); others operate over the covariance matrix or the spatial spectrum (Stoica and Moses, 2005). Their performance and behavior are different in general. CLEAN is seldom used in DOA estimation since it can offer only marginal improvement for uncorrelated sources. Furthermore, it is believed to be suited only for uncorrelated sources when it operates over the covariance matrix (Stoica and Moses, 2005). Here, we consider a CLEAN algorithm that operates over sensor signals. This algorithm works well for correlated and coherent sources. In fact, as correlation between sources increases, improvement becomes significant and the best result is achieved when the sources are coherent. 

FIGURE 4.3 Initial DOA estimation for two coherent sources by a partly filled NLA. One source is fixed at 70°; the second is swept. SNR = 10 dB. 

In order to apply CLEAN, initial estimates for the signal parameters should 
be available. Since these estimates are not perfect, the cancellation of signal 
components cannot be perfect and there are residuals. However, CLEAN can 
still perform well for coherent sources as long as the residuals are sufficiently 
small. 

Initial DOA estimates can be obtained in different ways. In our case, TC or 
FBSS techniques are used. Sensor positions are assumed to be known. Therefore, 
the array steering matrix can be constructed once the initial DOA estimates 
are available. The CLEAN algorithm for narrowband DOA estimation can be 
summarized as follows: 

Step 1. Given the NLA array, obtain the sample covariance matrix and apply TC. 

Step 2. Use the covariance matrix obtained in step 1 for initial DOA estimation by employing the Root-MUSIC algorithm. 

Step 3. Given the initial DOA estimates $\hat{\boldsymbol{\theta}} = [\hat{\theta}_1, \ldots, \hat{\theta}_n]$, apply CLEAN as follows: For each $\hat{\theta}_i$, $i = 1, \ldots, n$: 

1. Construct the $M \times n$ array steering matrix $\hat{\mathbf{A}}_r$ and find the source signal $\hat{\mathbf{s}} = [\hat{\mathbf{A}}_r^H \hat{\mathbf{A}}_r]^{-1} \hat{\mathbf{A}}_r^H \mathbf{y}_r$. 

2. Construct the $M \times (n-1)$ steering matrix $\hat{\mathbf{A}}_{r,i}$. $\hat{\mathbf{A}}_{r,i}$ is obtained from $\hat{\mathbf{A}}_r$ by deleting the $i$th column. Also obtain the source $\hat{\mathbf{s}}_i$ from $\hat{\mathbf{s}}$ by deleting the $i$th row. 

3. Find the cleaned signal $\mathbf{y}_{r,i} = \mathbf{y}_r - \hat{\mathbf{A}}_{r,i}\hat{\mathbf{s}}_i$ and the corresponding covariance matrix $\hat{\mathbf{R}}_i = \frac{1}{N}\sum_{t=1}^{N} \mathbf{y}_{r,i}(t)\,\mathbf{y}_{r,i}^H(t)$. 

4. Find the array interpolation matrix $\mathbf{T}_i$ from Equation (4.23) by using $\mathbf{A}_{r,i}$ and $\mathbf{A}_{v,i}$ for the real and virtual arrays, respectively, by considering $K$ calibration angles in the neighborhood of $\hat{\theta}_i$. 

5. Obtain $\mathbf{R}_{T,i} = \mathbf{T}_i \hat{\mathbf{R}}_i \mathbf{T}_i^H$. 

6. Use $\mathbf{R}_{T,i}$ in the Root-MUSIC algorithm to find the $\hat{\theta}_i$ estimate. 

Step 4. Repeat step 3, updating the initial estimate for better results. 
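The cleaning part of step 3 (sub-steps 1 to 3) can be sketched compactly. The example below uses noiseless coherent snapshots and exact steering vectors, so that the cleaned data can be compared with the contribution of the retained source alone. The function name `clean_isolate`, the seed, and the source parameters are assumptions for illustration; sub-steps 4 to 6 are not shown.

```python
import numpy as np

def steering(positions, angles_deg):
    """Steering matrix of a linear array; positions in half-wavelength units (assumed)."""
    pos = np.asarray(positions, dtype=float)[:, None]
    theta = np.deg2rad(np.asarray(angles_deg, dtype=float))[None, :]
    return np.exp(1j * np.pi * pos * np.cos(theta))

def clean_isolate(Y, A_hat, i):
    """Remove every source except source i from the M x N snapshot matrix Y."""
    S_hat = np.linalg.lstsq(A_hat, Y, rcond=None)[0]   # step 3.1: (A^H A)^{-1} A^H Y
    A_i = np.delete(A_hat, i, axis=1)                  # step 3.2: drop the i-th column
    S_i = np.delete(S_hat, i, axis=0)                  #           and the i-th row
    Y_i = Y - A_i @ S_i                                # step 3.3: cleaned snapshots
    R_i = Y_i @ Y_i.conj().T / Y.shape[1]              #           and their covariance
    return Y_i, R_i

rng = np.random.default_rng(1)
A = steering([0, 1, 4, 5, 11, 13], [75.0, 80.0])
s1 = rng.standard_normal(64) + 1j * rng.standard_normal(64)
S = np.vstack([s1, 0.9 * np.exp(0.7j) * s1])           # second path is a scaled copy: coherent
Y = A @ S                                              # noiseless snapshots
Y_0, R_0 = clean_isolate(Y, A, 0)
```

With noisy data and steering vectors built from imperfect initial estimates, the cancellation leaves residuals, as discussed above.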


CLEAN is not optimal in any sense, but it can perform as well as MUSIC for uncorrelated sources. Note that the signal estimate in step 3.1 is a maximum likelihood estimate, and its behavior due to several error sources can be found in Weiss and Friedlander (1993). The cleaning operation in step 3.3 is the orthogonal projection onto the null space of $\hat{\mathbf{A}}_{r,i}^H$. Therefore, the measurements are projected onto a subspace orthogonal to the signal components except for one. Because the sample covariance matrix at step 3.3 is always positive semi-definite (Horn and Johnson, 1985), the problem of obtaining a positive semi-definite covariance matrix in CLEAN algorithms operating over the covariance matrices (Stoica and Moses, 2005) is not observed in our case. Thus, a sample-based CLEAN algorithm statistically makes sense. Note that the sample covariance matrix at step 3.3 is different from the covariance matrix obtained in CLEAN algorithms that operate over the covariance matrices. 

In general, there is not much CLEAN can offer for ULA since FBSS performs sufficiently well for coherent sources. In this respect, CLEAN is an effective algorithm for NLA only. It can improve NLA DOA performance significantly. To show its performance, three experiments are performed with a minimum redundant NLA with sensor placement $d_{NLA} = [0\ 1\ 4\ 5\ 11\ 13]$. SNR is set to 10 dB, and two uncorrelated sources at 75° and 80° are considered. In Figures 4.4(a) and 4.4(b), the CLEAN spectrum for each source is shown. Initial estimates for the NLA, obtained by Toeplitz completion, are 75.1187° and 79.9340°, respectively. As can be seen from Figures 4.4(a) and 4.4(b), CLEAN can successfully isolate the source spectra. 

FIGURE 4.4 MUSIC and estimated CLEAN spectrum for two narrowband uncorrelated sources at 75° and 80°, respectively. (a) CLEAN spectrum for the source at 75°. (b) CLEAN spectrum for the source at 80°. SNR = 10 dB. 



FIGURE 4.5 MUSIC and estimated CLEAN spectrum for uncorrelated sources at 75° and 80°. Initial 
estimates are intentionally given as 73° and 82°. (a) CLEAN spectrum for the source at 75°. (b) CLEAN 
spectrum for the source at 80°. SNR = 10 dB. 


In Figure 4.5 the robustness of the CLEAN algorithm is shown when the initial estimates are mistakenly given as 73° and 82°, respectively. After three iterations, the resulting DOA angles are 74.8804° and 80.8160°. As can be seen, CLEAN can isolate the sources successfully even when the initial estimates are not accurate. 

In Figure 4.6 the same experiment is repeated for the coherent sources at 75° and 80°, respectively. The initial estimates, obtained by Toeplitz completion, are 74.9292° and 79.9747°, respectively. The MUSIC spectrum does not carry any information about the two sources, whereas the CLEAN spectrum clearly shows the source DOAs for both. 

FIGURE 4.6 MUSIC and estimated CLEAN spectrum for two coherent sources at 75° and 80°. (a) CLEAN spectrum for the source at 75°. (b) CLEAN spectrum for the source at 80°. SNR = 10 dB. 

4.3.4 Performance Comparisons 

The performance of different algorithms is evaluated for narrowband DOA estimation to observe the advantages and disadvantages of each: WAI with the estimated signal and noise powers; Wiener array interpolation when the true signal and noise powers are used (WAI-TRUE); the generalized rotational signal-subspace (GRSS) method; and CLEAN, described previously. A partly filled NLA is used with $d_{NLA} = [0\ 1\ 2\ 3\ 8\ 13\ 17]$. In the experiments, the number of snapshots is $N = 256$, and the number of trials is 500. 

In Figure 4.7 both uncorrelated and coherent cases are considered when there are two sources. One source is fixed at 70°, and the other is swept between 10° and 170°. An initial estimate is obtained from the first four elements by FBSS. Each array-mapping algorithm has three iterations. Figure 4.7(a) shows the performance of the algorithms when the sources are uncorrelated. CLEAN and GRSS have the best performance, followed by WAI-TRUE and WAI. The closely spaced sources are resolved better by GRSS. WAI is more sensitive to the initial estimate, and in fact performs better when the initial estimate is obtained via TC. Figure 4.7(b) shows DOA performance when the two sources are coherent. All the algorithms apply FBSS when the final covariance matrix is obtained. CLEAN and GRSS perform well, followed by WAI-TRUE and WAI. 

The same experiment is repeated for two fixed, equal-power sources at 70° and 85°, respectively. In Figure 4.8(a) the sources are uncorrelated and DOA performance is observed with respect to SNR. Figure 4.8(b) shows the case when the sources are coherent. It turns out that the performance of GRSS shows a flooring effect as SNR is increased. As the number of iterations for array mapping increases, this effect decreases. CLEAN, WAI, and WAI-TRUE are more robust to source coherence as SNR increases. 

FIGURE 4.7 DOA performance for two sources: (a) uncorrelated; (b) coherent (SNR = 10 dB). 

FIGURE 4.8 SNR performance for two sources at 70° and 85°: (a) uncorrelated; (b) coherent. 

The performance of CLEAN and GRSS significantly improves as the number of iterations increases. Figure 4.9 shows the performance of the array-mapping algorithms with only one iteration. Figure 4.9(a) can be compared with Figure 4.7(a) and Figure 4.9(b) can be compared with Figure 4.8(a) to see the difference for one and three iterations, respectively. It seems that CLEAN performs slightly better than GRSS when there is only a single iteration. In general, for the resolution of closely spaced sources CLEAN is not as good as GRSS. Also, its computational complexity is higher than that of GRSS. 

FIGURE 4.9 DOA performance for two uncorrelated sources when the array-mapping algorithms have a single iteration. (a) One source is fixed at 70° and the second is swept (SNR = 10 dB). (b) Two sources are fixed at 70° and 85°; SNR performance is observed. 

FIGURE 4.10 Effect of the number of snapshots for two narrowband coherent sources at 70° and 85°, respectively. SNR = 10 dB. 


In Figure 4.10 the performance of the algorithms is shown when the number of snapshots is changed. The two coherent sources are placed at 70° and 85°, respectively. In this case, the number of iterations for each algorithm is 3 except for the GRSS, which is considered for 3 and 9 iterations. The figure shows that increasing the number of iterations beyond a certain value does not improve the performance of GRSS for the narrowband coherent signal case. 

It turns out that the RSS method in Hung and Kaveh (1988) is very effective for array mapping, especially in the case of uncorrelated source signals when the angular sector for DOAs is small. As the angular sector for array mapping is increased, it has significant bias. When the sensor array is circular, an initial DOA estimate is not available and the angular sector should be kept sufficiently large. In such cases, WAI can perform significantly better than RSS. To see the performance loss for RSS, a 7-element circular array with $\lambda/2$ inter-element spacing is mapped to a virtual ULA with the same number of elements, where the elements are spaced by $\lambda/5$. A 50° sector $\theta \in [40°, 90°]$ is selected for the array interpolation. 

FIGURE 4.11 Array mapping from a 7-element circular array to a virtual ULA. (a) Two uncorrelated sources at 70° and 85°. (b) One source is fixed at 70°, the second is swept. SNR = 15 dB. 

Figure 4.11(a) shows the SNR performance when there are two uncorrelated sources at 70° and 85°, respectively. In Figure 4.11(b), one source at 70° is fixed and the second is swept between 30° and 100°. As can be seen, WAI is more effective than RSS for this case. RSS has a significant bias when the angular sector is large. 

An interesting observation is that the Spectral-MUSIC algorithm applied on the original circular array performs worse than WAI when the source DOAs are close to each other. This is mainly due to the fact that in Wiener array interpolation Root-MUSIC is used. When the error has a significant radial component, Spectral-MUSIC fails to show two distinct peaks, whereas Root-MUSIC is unaffected. This phenomenon has been analyzed in the literature, and practical cases are reported in Covarrubias-Rosales and Olague (2004) and Teague (2002). 


4.4 WIDEBAND DIRECTION-OF-ARRIVAL 
ESTIMATION 

In practice, most signals are wideband. Narrowband processing of such signals ignores the useful information available for parameter estimation. Wideband processing in this context allows better parameter estimation because of the processing gain. Wideband processing gain (WBPG) can be defined as the ratio of the variance of DOA estimation for one frequency to the variance of DOA estimation for $L$ frequencies: 

$$\mathrm{WBPG} = \frac{\sigma_{\theta,1}^2}{\sigma_{\theta,L}^2} \tag{4.33}$$

When the sensor outputs for different frequencies are uncorrelated, a Fisher information matrix (FIM) can be written as the sum of the narrowband terms: 

$$\mathrm{FIM} = \sum_{j=1}^{L} \mathbf{F}(w_j) \tag{4.34}$$

Assuming that the statistics for each frequency are the same, 

$$\mathrm{FIM}^{-1} = \frac{1}{L}\,\mathbf{F}^{-1}(w) \tag{4.35}$$

Therefore, $\mathrm{WBPG} \le L$ in wideband DOA estimation, which indicates that there is significant gain for DOA estimation when wideband processing is employed. Figure 4.12 shows the Cramér-Rao bound for a 10-element ULA in the case of a single source as the number of frequencies, $L$, in wideband processing is changed. 

FIGURE 4.12 CRB versus number of frequencies used in wideband processing. SNR = 20 dB, 10-element ULA. 

Wideband DOA estimation can be performed in different ways, which can 
be divided into two main groups: coherent and noncoherent. 

4.4.1 Wideband Signal Model 

In the wideband signal model, the time-bandwidth product in Equation (4.2) is not small. In this case, sensor signals are usually treated in the frequency domain. The received signal for the $i$th sensor can be written as 

$$y_{ri}(t) = \sum_{k=1}^{n} s_k(t - \tau_{ik}) + v_i(t) \tag{4.36}$$

where $s_k(t)$ and $v_i(t)$ are the source and noise signals. Noise is assumed to be both spatially and temporally white and uncorrelated with the signal. Equation (4.36) can be written in the frequency domain as 

$$Y_{ri}(\omega) = \sum_{k=1}^{n} e^{-j\omega\tau_{ik}} S_k(\omega) + V_i(\omega) \tag{4.37}$$

where $\tau_{ik} = \frac{d_i}{c}\cos\theta_k$, $d_i$ is the position of the $i$th sensor, and $c$ is the propagation velocity. In matrix vector form, this equation becomes 

$$\mathbf{Y}_r(\omega) = \mathbf{A}_r(\omega)\,\mathbf{S}(\omega) + \mathbf{V}(\omega) \tag{4.38}$$

It is assumed that there are $L$ frequency bins, $w_j$, $j = 1, \ldots, L$, for the bandwidth under consideration. The $M \times n$ matrix $\mathbf{A}_r(\omega)$ is the real array steering matrix at frequency $\omega$. The $M \times M$ sample covariance matrix is 

$$\hat{\mathbf{R}}(\omega) = \frac{1}{N}\sum_{m=1}^{N} \mathbf{Y}_r(\omega, m)\,\mathbf{Y}_r^H(\omega, m) \tag{4.39}$$

where $N$ is the number of snapshots. 
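One common way to obtain the frequency-domain snapshots $\mathbf{Y}_r(\omega, m)$ is to cut the sensor time series into $N$ segments and take an FFT of each; this segmentation is an assumption here, since the text does not say how the snapshots are formed. The sketch below then evaluates Equation (4.39) for a chosen set of FFT bins. The function name, segment length, and test data are illustrative.

```python
import numpy as np

def bin_covariances(x, n_fft, bins):
    """Per-bin sample covariances of Equation (4.39) from an M x T sensor time series."""
    M, T = x.shape
    N = T // n_fft                                   # number of frequency-domain snapshots
    segs = x[:, :N * n_fft].reshape(M, N, n_fft)     # non-overlapping segments (assumed)
    Y = np.fft.fft(segs, axis=2)                     # Y[:, m, b] is sensor data at bin b, snapshot m
    R = np.empty((len(bins), M, M), dtype=complex)
    for idx, b in enumerate(bins):
        Y_b = Y[:, :, b]                             # M x N snapshots at this bin
        R[idx] = Y_b @ Y_b.conj().T / N
    return R

rng = np.random.default_rng(2)
x = rng.standard_normal((6, 64 * 32))                # placeholder data: 6 sensors, white noise
R_bins = bin_covariances(x, n_fft=64, bins=[4, 5, 6, 7])
min_eig = min(np.linalg.eigvalsh(R_b).min() for R_b in R_bins)
```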


4.4.2 Signal Correlation and Coherence 


Signal correlation for wideband signals is defined slightly differently than for the narrowband case (Friedlander and Weiss, 1993). Consider two signals $s_1(t)$ and $s_2(t)$ for simplicity. Let $s_1(t)$ be a white signal and 

$$s_2(t) = a\,s_1(t - \tau) + g(t) \tag{4.40}$$

where $g(t)$ is another white signal, and $a$ is a constant factor. In the frequency domain, 

$$S_2(w) = a\,e^{-jw\tau} S_1(w) + G(w) \tag{4.41}$$

If we assume that $s_1(t)$ and $s_2(t)$ have the same power (i.e., $E\{|S_1(w)|^2\} = E\{|S_2(w)|^2\}$), the signal power for $g(t)$ is 

$$E\{|G(w)|^2\} = (1 - a^2)\,E\{|S_1(w)|^2\} \tag{4.42}$$

The correlation coefficient between $S_1(w)$ and $S_2(w)$ is defined as 

$$\rho(w) = \frac{E\{S_1(w)\,S_2^*(w)\}}{\sqrt{E\{|S_1(w)|^2\}\,E\{|S_2(w)|^2\}}} \tag{4.43}$$

Let $B = w_L - w_1$ be the signal bandwidth in radians per second. The delay-bandwidth product (DBP) is defined as 

$$\mathrm{DBP} = \frac{B\tau}{2\pi} \tag{4.44}$$

If we assume a nonzero bandwidth, DBP = 0 has different effects depending on 
the correlation coefficient p(w). If p(w) = 1, ^(f) =s 2 (0 and multipath signals 
arrive at the sensors at the same time. 

This is the most problematic case for DOA estimation because the covariance
matrix becomes rank deficient. DBP = 1 is the mildest case of correlation.
In this case, multipath signals arrive at different times. It is known
that wideband processing can handle correlated sources efficiently provided
that the DBP is sufficiently large (Friedlander and Weiss, 1993). For such
cases, the average of the focused covariance matrices in coherent wideband
processing generates an effect similar to the average of the covariance
matrices for subarrays in FBSS for the narrowband case. When DBP = 0,
coherent wideband processing alone is not sufficient to estimate DOA angles
accurately.
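The role of the DBP can be illustrated numerically. If $g(t)$ is assumed to be uncorrelated
with $s_1(t)$, Equations (4.41) and (4.43) give $\rho(\omega) = a\,e^{j\omega\tau}$, so the correlation has
constant magnitude $a$ in every bin but a phase that rotates across the band. The sketch
below averages this coefficient over equally spaced bins. The averaging step, the function
names, and the example band are assumptions made for illustration; they are not taken
from the chapter.

```python
import cmath
import math


def delay_bandwidth_product(bandwidth_rad_s, tau_s):
    """DBP = B * tau / (2 * pi), as in Equation (4.44)."""
    return bandwidth_rad_s * tau_s / (2.0 * math.pi)


def band_averaged_correlation(a, tau_s, w_low, w_high, n_bins=2001):
    """Average rho(w) = a * exp(j * w * tau) over equally spaced bins."""
    total = 0j
    for k in range(n_bins):
        w = w_low + (w_high - w_low) * k / (n_bins - 1)
        total += a * cmath.exp(1j * w * tau_s)
    return total / n_bins


# Example band (assumed): 100 Hz to 200 Hz, expressed in rad/s.
w_low, w_high = 2.0 * math.pi * 100.0, 2.0 * math.pi * 200.0
bandwidth = w_high - w_low
for dbp in (0.0, 0.25, 0.5, 1.0):
    tau = 2.0 * math.pi * dbp / bandwidth      # delay that yields this DBP
    rho_bar = band_averaged_correlation(1.0, tau, w_low, w_high)
    print(f"DBP = {dbp:4.2f}  |band-averaged rho| = {abs(rho_bar):.3f}")
```

Under these assumptions the averaged magnitude stays at $a$ for DBP = 0 and falls toward zero
as the DBP approaches 1, which is one way to see why averaging over frequency helps only
when the delay is not negligible.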


4.4.3 Noncoherent Wideband DOA Estimation 

The idea in noncoherent wideband DOA estimation is simple. The DOA information
at every frequency bin is separately processed and the final results combined.
This is a simple approach, but there are different ways of implementing it. Trivial
noncoherent DOA estimation finds the DOA estimates at each frequency bin and
takes their average. It turns out that it is possible to obtain better accuracy than this
trivial approach. We consider two variants of noncoherent wideband DOA estimation:
NCO1 and NCO2, which use Spectral-MUSIC as a tool. The narrowband
spectra are summed to obtain a final spectrum of estimation. Since this summation
uses different frequency bins without considering subspace matching,
these methods are considered noncoherent. The NCO1 estimate is given as

$$\hat{\theta} = \arg\min_{\theta} \sum_{j=0}^{L-1} a_\theta(\omega_j)\, G(\omega_j)\, G^H(\omega_j)\, a_\theta^H(\omega_j) \qquad (4.45)$$

where $L$ is the number of frequency bins, $G(\omega_j)$ is the noise subspace at frequency
$\omega_j$, and $a_\theta(\omega_j)$ is the array steering vector. Similarly, the NCO2 estimate is

$$\hat{\theta} = \arg\max_{\theta} \sum_{j=0}^{L-1} \frac{1}{a_\theta(\omega_j)\, G(\omega_j)\, G^H(\omega_j)\, a_\theta^H(\omega_j)} \qquad (4.46)$$

It turns out that NCO1 and NCO2 perform similarly even though their formulations
are quite different. In the following sections of this chapter, we consider
NCO1 as the noncoherent wideband DOA estimation method.
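A minimal sketch of the NCO1 cost is given here for a linear array. It treats the steering
vector as a column vector and evaluates the quadratic form as $\|G^H a\|^2$ in each bin. The
cosine steering convention, the unit propagation speed, the function names, and the toy
covariances are all assumptions made for illustration.

```python
import numpy as np


def steering(positions, theta_deg, freq, c=1.0):
    """Column steering vector of a linear array (assumed cosine convention)."""
    delay = positions * np.cos(np.deg2rad(theta_deg)) / c
    return np.exp(1j * 2.0 * np.pi * freq * delay)


def noise_subspace(R, n_sources):
    """Eigenvectors of R belonging to the M - n smallest eigenvalues."""
    _, vecs = np.linalg.eigh(R)              # eigenvalues in ascending order
    return vecs[:, : R.shape[0] - n_sources]


def nco1_cost(covs, freqs, positions, n_sources, grid_deg):
    """Sum of the narrowband MUSIC null spectra over the frequency bins."""
    cost = np.zeros(len(grid_deg))
    for R, f in zip(covs, freqs):
        G = noise_subspace(R, n_sources)
        for i, th in enumerate(grid_deg):
            a = steering(positions, th, f)
            cost[i] += np.linalg.norm(G.conj().T @ a) ** 2
    return cost


# Toy setup: exact covariances for two uncorrelated sources at 70 and 85 degrees.
freqs = np.array([0.8, 0.9, 1.0])
positions = 0.5 * np.arange(7)               # half-wavelength spacing at freq = 1
covs = []
for f in freqs:
    A = np.column_stack([steering(positions, th, f) for th in (70.0, 85.0)])
    covs.append(A @ A.conj().T + 0.1 * np.eye(len(positions)))
grid = np.arange(0.0, 180.5, 0.5)
cost = nco1_cost(covs, freqs, positions, 2, grid)
```

The DOA estimates are then read off the deepest minima of `cost`. The nested loop over bins
and grid points makes the computational burden of the dense search visible.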

Noncoherent methods perform well with uncorrelated sources. However, a
search in a dense grid should be performed to avoid the local minima of the
multidimensional surface of $\theta$ parameters (4.45), which makes the computational
complexity of noncoherent methods high. Noncoherent methods also perform
poorly for coherent sources, just like the MUSIC algorithm.


4.4.4 Coherent Wideband DOA Estimation 

The idea in coherent wideband DOA estimation is to coherently combine
the covariance matrices at each frequency bin in order to apply narrowband
techniques directly over a single covariance matrix. This is achieved by
transforming the covariance matrices at each frequency into a covariance matrix
at a focusing frequency, such as the center frequency. Thus, coherent methods
find the transformation matrices for each frequency, perform the transformation,
and sum the covariance matrices to obtain the final and focused covariance
matrix. Then they apply narrowband DOA estimation techniques. Clearly,
coherent methods are computationally more efficient than noncoherent techniques.
It turns out that their performance can be better than that of noncoherent
methods.


Wiener Array Interpolation for Coherent 
Wideband Processing 


WAI is discussed in detail in Section 4.3.2. It is possible to extend the WAI
formulation to wideband processing. The main difference between the narrowband
and wideband cases is that there is a different transformation matrix for each
frequency:

$$T(\omega) = \sigma_s^2 A_v(\omega_c) A_r^H(\omega) \left( \sigma_s^2 A_r(\omega) A_r^H(\omega) + \sigma_v^2 I \right)^{-1} \qquad (4.47)$$

In addition, as is obvious in Equation (4.47), two mappings are performed
simultaneously in the case of NLA. The first is for the transformation of the covariance
matrix at frequency $\omega$ to frequency $\omega_c$. The second is for obtaining a virtual
array response from the real array. The application of WAI here is similar to its
application in the narrowband case. The algorithm steps are as follows:

Step 1. Given the sensor outputs, $Y_r(\omega_c)$, at center frequency $\omega_c$, obtain
the sample covariance matrix $R(\omega_c)$. Use Toeplitz completion for NLA. Fill
the missing covariance lags using spline interpolation if necessary and find the
complete Hermitian symmetric Toeplitz covariance matrix, $R_{rc}(\omega_c)$.

Step 2. Use $R_{rc}(\omega_c)$ in Root-MUSIC and find the initial DOA estimate $\hat{\theta}$.

Step 3. Given $\hat{\theta}$, construct $T(\omega_j)$ from Equation (4.47) for $j = 1, \ldots, L$.

Step 4. Find $R_p = \frac{1}{L} \sum_{j=1}^{L} T(\omega_j) R(\omega_j) T^H(\omega_j)$ and use Root-MUSIC to find
the true DOA estimate.

Step 5. Iterate steps 3 and 4 by updating the initial estimates.


GRSS for Coherent Wideband Processing 

A coherent subspace method (CSM; Wang and Kaveh, 1985) was originally
proposed for wideband processing. In Wang and Kaveh (1987), CSM performance
was investigated analytically. The RSS technique was presented in Hung and
Kaveh (1988) as an effective method for the design of the mapping matrix. In
Section 4.3.2, RSS was discussed for the narrowband case. Wideband extension
can be easily done. In this case, the formulation is frequency dependent:

$$T(\omega) = \arg\min_{T(\omega)} \left\| T(\omega) A_r(\omega) - A_v(\omega_c) \right\|_F^2 \quad \text{s.t.} \quad T^H(\omega) T(\omega) = I_M \qquad (4.48)$$

where the $M \times K$ matrix $A_r(\omega)$ stands for the real array steering matrix at frequency
$\omega$, and the $M_v \times K$ matrix $A_v(\omega_c)$ is the array steering matrix for the virtual array
at center frequency. It is assumed that $K$ calibration angles are used in the
neighborhood of the initial DOA estimates. The solution for the $M_v \times M$ matrix $T(\omega)$ is

$$T(\omega) = V \tilde{\Sigma} U^H \qquad (4.49)$$

where the $M_v \times M$ matrix $\tilde{\Sigma} = [I_{M \times M} \;\; 0]^T$, and the $M_v \times M_v$ and $M \times M$ unitary matrices
$V$ and $U$ are obtained from the singular value decomposition:

$$A_r(\omega) A_v^H(\omega_c) = U S V^H \qquad (4.50)$$

The formulation in Equation (4.49) can be used to map arrays with $M_v > M$.
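Equations (4.49) and (4.50) translate almost directly into a few lines of linear algebra.
The sketch below is an illustration under assumptions: the function names are hypothetical,
the steering matrices are passed in already evaluated at the calibration angles, and the
second helper simply averages the focused covariance matrices over the bins.

```python
import numpy as np


def rss_focusing_matrix(A_r, A_v):
    """Return T = V Sigma U^H, where A_r A_v^H = U S V^H (requires M_v >= M)."""
    M, M_v = A_r.shape[0], A_v.shape[0]
    U, _, Vh = np.linalg.svd(A_r @ A_v.conj().T)   # full SVD: U is M x M, Vh is M_v x M_v
    Sigma = np.eye(M_v, M)                          # the M_v x M matrix [I_M 0]^T
    return Vh.conj().T @ Sigma @ U.conj().T


def focused_covariance(covs, Ts):
    """Average of T(w_j) R(w_j) T(w_j)^H over the frequency bins."""
    return sum(T @ R @ T.conj().T for T, R in zip(Ts, covs)) / len(covs)
```

By construction the returned matrix has orthonormal columns, so the constraint in
Equation (4.48) holds up to rounding error. A narrowband method such as Root-MUSIC can
then be run on the output of `focused_covariance`.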


4.4.5 Coherent Wideband Processing 
for Multipath Signals 

Multipath signals generate different signal scenarios, as described in Section
4.4.2. When DBP = 0, additional processing is required to correct the rank of the
covariance matrix. For ULA, FBSS is an effective technique for this additional
processing; it is applied to the final covariance matrix obtained in coherent
wideband processing. For NLA, it is not possible to use FBSS directly. Furthermore,
the initial DOA estimation is a difficult problem for coherent source
signals in this case. Here CLEAN is presented to deal with it. Although intended
for NLA, CLEAN is effective for ULA as well.


CLEAN Algorithm for Coherent Wideband 
DOA Estimation 


CLEAN was discussed in detail for narrowband processing in Section 4.3.3.
It is possible to use the same approach in wideband processing with small
modifications, following these steps:

Step 1. Given an NLA, obtain the sample covariance matrix at $\omega_c$ and apply
Toeplitz completion.

Step 2. Use the covariance matrix obtained in step 1 for initial DOA
estimation by employing Root-MUSIC.

Step 3. Given the initial DOA estimates $\hat{\theta} = [\hat{\theta}_1, \ldots, \hat{\theta}_n]$, apply CLEAN as
follows: Take the frequency as $\omega_j$, $j = 1, \ldots, L$. For each $\hat{\theta}_i$, $i = 1, \ldots, n$:

1. Construct the $M \times n$ array steering matrix $\hat{A}_r(\omega_j)$ and find
$\hat{S}(\omega_j) = [\hat{A}_r^H(\omega_j) \hat{A}_r(\omega_j)]^{-1} \hat{A}_r^H(\omega_j) Y_r(\omega_j)$.

2. Construct the $M \times (n-1)$ steering matrix $\hat{A}_r^i(\omega_j)$. $\hat{A}_r^i(\omega_j)$ is obtained
from $\hat{A}_r(\omega_j)$ by deleting the $i$th column. Also obtain the source $\hat{S}^i(\omega_j)$
from $\hat{S}(\omega_j)$ by deleting the $i$th row.

3. Find the cleaned signal $Y_r^i(\omega_j) = Y_r(\omega_j) - \hat{A}_r^i(\omega_j) \hat{S}^i(\omega_j)$ and the corresponding
covariance matrix $R^i(\omega_j) = \frac{1}{N} \sum_{m=1}^{N} Y_r^i(\omega_j, m)\, Y_r^{iH}(\omega_j, m)$.

4. Find the array interpolation matrix $T^i(\omega_j)$ from Equation (4.49)
using $A_r^i(\omega_j)$ and $A_v^i(\omega_j)$ for the real and virtual arrays, respectively,
considering $K$ calibration angles in the neighborhood of $\hat{\theta}_i$.

5. Obtain $R_T^i(\omega_j) = T^i(\omega_j) R^i(\omega_j) T^i(\omega_j)^H$ and $R_T^i = \frac{1}{L} \sum_{j=1}^{L} R_T^i(\omega_j)$.

6. Use $R_T^i$ in Root-MUSIC to find the $\hat{\theta}_i$ estimate.

Step 4. Repeat step 3 by updating the initial estimate for better results.
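Items 1 to 3 of step 3 can be sketched for a single frequency bin as follows. This is a
minimal illustration under assumptions: the function name is hypothetical, the steering
matrix is assumed to have full column rank, and the array mapping and Root-MUSIC stages
of items 4 to 6 are left out.

```python
import numpy as np


def clean_single_source(Y, A, i):
    """Remove every source except source i from the snapshots of one bin.

    Y: M x N snapshot matrix at one frequency bin.
    A: M x n steering matrix built from the current DOA estimates.
    i: zero-based index of the source to keep.
    """
    # Item 1: least-squares estimate of all source signals.
    S = np.linalg.solve(A.conj().T @ A, A.conj().T @ Y)
    # Item 2: drop the i-th column of A and the i-th row of S.
    A_rest = np.delete(A, i, axis=1)
    S_rest = np.delete(S, i, axis=0)
    # Item 3: subtract the other sources and form the sample covariance.
    Y_clean = Y - A_rest @ S_rest
    R_clean = Y_clean @ Y_clean.conj().T / Y.shape[1]
    return Y_clean, R_clean
```

In the noise-free case with exact steering vectors, `R_clean` has rank one. That is the
single-source situation for which array mapping is most accurate.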


The CLEAN algorithm is very effective for coherent (or multipath) sources.
The main reason for this is that the accuracy of array mapping increases significantly
when a single source is considered. Since the main source of error for NLA
is the array mapping, CLEAN performs well compared to the other algorithms.
Note that FBSS can be applied to the final covariance matrix after array mapping
to improve performance for coherent sources.


4.4.6 Performance Comparisons 

Comparison of the algorithms for the wideband case is similar to that for
narrowband simulations. The partly filled NLA has a sensor placement of
[0 1 2 3 8 13 17]. The ratio of the signal bandwidth to the
center frequency is $B/f_c = 2/3$. Eleven frequency bins are used for wideband
processing. Figure 4.13 shows the performance for two sources: one fixed
at 70°, and the other swept from 10° to 170°. In Figure 4.13(a) source signals
are uncorrelated. CLEAN and GRSS have very good performance for
these. When the sources are coherent with DBP = 0, CLEAN shows slightly
better performance than GRSS, although GRSS can resolve closely spaced
sources better. In these experiments array mapping has three iterations for all
algorithms.

FIGURE 4.13 DOA performance for two wideband sources: (a) uncorrelated; (b) coherent. SNR = 10 dB.

FIGURE 4.14 SNR performance for two wideband sources: (a) uncorrelated; (b) coherent. SNR = 10 dB.

Figure 4.14 shows the SNR performance of the algorithms. There are two
sources at 70° and 85°, respectively. Figure 4.14(a) shows the case when the
source signals are uncorrelated. GRSS-3-ITER and GRSS-9-ITER are the GRSS
methods where array-mapping iterations are 3 and 9, respectively. Iterations are
increased for GRSS to show the gain with iterations as well as to show that
CLEAN may still outperform GRSS significantly for certain cases. When the
sources are uncorrelated, a small number of iterations are sufficient and GRSS
and CLEAN perform well. When the sources are coherent with DBP = 0, CLEAN
performs significantly better. As the number of iterations increases, GRSS gets
closer to CLEAN. In general, WAI and WAI-TRUE have similar performance,
and the estimated noise and source powers are sufficiently accurate.

FIGURE 4.15 Performance for two wideband coherent sources in terms of number of snapshots.
SNR = 10 dB.

Figure 4.15 shows the performance of the algorithms in terms of the number
of snapshots for the same two coherent sources. CLEAN performs better in
comparison.

Additional comparisons are done with noncoherent and TOPS algorithms.
Since these algorithms are computationally intense, the number of trials is
decreased to 100. In Figure 4.16(a), two wideband sources at 70° and 85° are
fixed and the third is swept between 10° and 170°. Source signals are uncorrelated
and the same NLA is used. The initial estimate is obtained by TC. CLEAN,
GRSS, and noncoherent algorithms perform well, while TOPS does not come
close to the Cramér-Rao bound (CRB). In Figure 4.16(b) three uncorrelated
sources are placed at 70°, 85°, and 95°, respectively. A similar behavior for the
algorithms' performance is observed.

In Figure 4.17 a ULA with ten sensors is used. An initial estimate is obtained
directly from the sensor data by the Root-MUSIC algorithm using only a
single-frequency covariance matrix. There are three uncorrelated wideband sources at
70°, 85°, and 95°, respectively. While CLEAN, GRSS, and the noncoherent
algorithms perform well, TOPS does not come close to the same performance.

FIGURE 4.16 Performance for NLA for three wideband uncorrelated sources. (a) Two sources at 70° and
85° are fixed; the third source is swept (SNR = 10 dB). (b) Sources are at 70°, 85°, and 95°, respectively.

FIGURE 4.17 Performance for three wideband uncorrelated sources for a ULA with ten sensors.

Figure 4.18 shows the DBP characteristics for a ULA with ten sensors. There
are two sources at 70° and 85°, respectively. SNR = 10 dB, and the initial estimate
is obtained from a single frequency using FBSS. None of the algorithms uses FBSS for
the final covariance matrix. CLEAN performs better than the others since it treats
single sources. GRSS performance improves as the DBP increases. Noncoherent
and TOPS seem to be insensitive to DBP, which is reasonable. They do not
perform well for coherent sources.

FIGURE 4.18 DBP performance for two coherent wideband sources for a ULA with ten sensors.

FIGURE 4.19 ULA performance for two wideband sources at 70° and 85°: (a) uncorrelated sources; (b)
coherent sources (DBP = 0).

Figure 4.19 shows the performance of the algorithms for uncorrelated and
coherent sources in the case of a ULA. In Figure 4.19(a) two sources at 70° and
85° are uncorrelated. An initial estimate is obtained from the ULA by the
Root-MUSIC algorithm for a single frequency. In Figure 4.19(b) sources are coherent
with DBP = 0. An initial estimate is obtained using FBSS for a single frequency.
All the algorithms use FBSS for improving their covariance matrices. CLEAN
and GRSS perform well for both uncorrelated and coherent cases. Noncoherent
and TOPS algorithms do not perform satisfactorily.


4.5 CONCLUSION 

NLAs have certain advantages and problems for DOA estimation. They cover a 
large array aperture with fewer sensors, and they require fewer matched channels 
or receivers. Their disadvantage is that coherent sources cannot be easily handled. 
Furthermore, they need additional computation to compensate for and augment 
the missing sensor information. The completion of missing sensor data is required 
to improve accuracy. Array mapping is an effective method of augmenting the 
NLA covariance matrix. Array-mapping accuracy can be improved significantly 
if an initial DOA estimate is used and then the estimates are improved iteratively. 
For this reason initial DOA estimation is a key problem for NLA, but it can be 
easily solved for uncorrelated sources. Toeplitz completion can be used directly 
for this purpose. Initial DOA estimation for coherent sources is not an easy task. 
While TC and FBA return good estimates for certain DOA angles, they do not 
perform well for others. 

A promising approach for coherent signals is to use partly filled NLA. Initial 
DOA estimates can be obtained via FBSS for the ULA part of this array. Then 
array mapping can generate a covariance matrix corresponding to a full array 
with the same aperture. Different array-mapping techniques exist in the literature. 
Classical array interpolation is well known, but it has certain limitations. Wiener 
array interpolation performs well, but it produces focusing loss for NLA. It 
performs better for circular arrays compared to alternative techniques where a 
large angular sector for array mapping is used. Array mapping with the RSS 
and CLEAN algorithms works well for a variety of cases as long as the angular 
sector is small. In this respect, RSS and CLEAN are good candidates for DOA 
estimation for NLA. 

Wideband DOA estimation has a processing gain proportional to the number
of frequency bins that carry information about the source signal. In this
respect, it can improve DOA accuracy compared to narrowband processing. The
main problem is computational complexity. Noncoherent techniques generally
perform well for uncorrelated sources, but computational complexity increases
with the number of frequency bins. Coherent wideband processing is an effective
and computationally efficient alternative.

When array mapping is used to cohere the covariance matrices, narrowband 
techniques can be applied on a single covariance matrix. GRSS and CLEAN 
methods perform well even with coherent source signals. They allow FBSS to 
be used in the final covariance matrix, and as a result they can be employed for 
both uncorrelated and coherent wideband sources. They also perform better than 
the noncoherent method. These techniques can be used for both ULA and NLA. 
Since array mapping is applied more effectively when an initial DOA estimate 
is used, partly filled NLA can be used for wideband DOA estimation when there 
are multipath signals.


Search-Free DOA Estimation
Algorithms for Nonuniform
Sensor Arrays


Michael Rübsamen, Alex B. Gershman


5.1 INTRODUCTION 


Over the last four decades, a large number of methods for estimating the
directions of arrival (DOA) of multiple narrowband far-field signal sources have been
proposed. Important characteristics of these methods are their DOA estimation
mean-square error (MSE) performance, computational complexity, and required
array geometry specifications. Many techniques are applicable only to quite
specific array geometries, such as uniform linear arrays (ULAs) or uniform circular
arrays (UCAs). However, it is often unsuitable in practical scenarios to restrict
the array geometry to a certain class. Therefore, methods that can be applied to
arbitrary nonuniform array geometries are in great demand.

Among the known DOA estimation methods applicable to arbitrary arrays,
those based on the maximum likelihood (ML) principle have been shown to
achieve the best MSE performance (Böhme, 1986; Jaffer, 1988; Schweppe, 1968;
Wax and Kailath, 1983; Ziskind and Wax, 1988). However, their computational
complexity is usually prohibitively high because a multidimensional search is
required to find the global maximum of the nonconvex likelihood function.

Several computationally efficient high-resolution DOA estimation methods
applicable to arbitrary arrays have been proposed based on the concept
of signal and noise subspaces. One of the most popular subspace-based techniques
is the Multiple-Signal Classification (MUSIC) algorithm (Bienvenu and
Kopp, 1980; Schmidt, 1979). Compared to ML techniques, MUSIC leads to a
spectral search over a reduced parameter space and therefore offers dramatically
lower computational complexity. However, the complexity of the spectral
search in the conventional MUSIC algorithm may still be too high for real-time
applications.

The lowest computational complexities in subspace-based techniques are
achieved by so-called search-free DOA estimation methods. We provide a
detailed overview of search-free techniques in this chapter, with a particular
emphasis on nonuniform sensor arrays.


The oldest and perhaps the most popular search-free DOA estimation techniques
are Root-MUSIC (Barabell, 1983; Rao and Hari, 1989) and ESPRIT
(Estimation of Signal Parameters via Rotational Invariance Technique; Paulraj
et al., 1986; Roy and Kailath, 1989; Roy et al., 1986). However, these
two can only be applied to arrays with very specific geometries. In particular,
Root-MUSIC applies only to ULAs and nonuniform arrays (NUAs) of
which the sensors lie on a uniform grid, whereas ESPRIT requires arrays that
consist of two identical and identically oriented subarrays. Several useful extensions
of Root-MUSIC and ESPRIT to more general classes of array geometries
have been proposed, such as UCA Root-MUSIC, UCA ESPRIT (Mathews and
Zoltowski, 1994), Rank Reduction Estimator (RARE; Pesavento et al., 2002),
and generalized ESPRIT (Gao and Gershman, 2005).

There has been a growing interest in developing search-free DOA estimation
techniques for arbitrary array geometries. The array interpolation approach
(Friedlander, 1993; Friedlander and Weiss, 1992, 1993; Weiss et al., 1995)
employs the idea of approximating any actual NUA by a virtual ULA. Then
any ULA-based search-free method can be applied to the virtual ULA observations,
including the standard Root-MUSIC algorithm. The manifold separation
(MS) approach (Belloni et al., 2007; Doron and Doron, 1994a,b,c) extends
Root-MUSIC to arbitrary array geometries by modeling the received wavefield via an
orthogonal expansion that approximates the true array steering vector as a product
of a matrix depending only on the array parameters and a Vandermonde vector
depending only on the angle. The Vandermonde structure of the latter vector is
then exploited to obtain a polynomial that can be rooted to estimate the source
DOAs.


Another recent approach to search-free DOA estimation for arbitrary arrays is
Fourier Domain (FD) Root-MUSIC (Rübsamen and Gershman, 2008, 2009). FD
Root-MUSIC exploits the fact that the null-spectrum MUSIC function is periodic
in angle. More specifically, it uses the truncated Fourier series expansion of this
periodic function to reformulate the DOA estimation problem in terms of polynomial
rooting rather than spectral search. FD Root-MUSIC is related to the MS
technique, but it has been shown to achieve substantial performance improvements
in the asymptotic domain at no additional cost. Further refinement of this
technique is the FD-Weighted Least-Squares (FDWLS) Root-MUSIC algorithm
(Rübsamen and Gershman, 2009), which uses a weighted least-squares approximation
of the MUSIC null-spectrum function to improve the DOA estimation
performance.

This chapter is organized as follows. In Section 5.2 we introduce our notation
and provide some necessary background on the DOA estimation problem.
Then in Section 5.3 we focus on search-free DOA estimation methods applicable
to specific array geometries. First, classic search-free subspace-based techniques
are discussed such as Root-MUSIC and ESPRIT. Next, more recent
search-free direction-finding methods are considered that extend the traditional
Root-MUSIC and ESPRIT approaches to several useful classes of nonuniform
sensor arrays. These methods include the UCA extensions of Root-MUSIC, the
generalized ESPRIT methods, and the Root-RARE techniques. Section 5.4 is
devoted to search-free DOA estimation methods applicable to arbitrary array
geometries; we discuss the interpolated Root-MUSIC technique, the MS method,
and the family of FD Root-MUSIC algorithms.

Simulation results are presented in Section 5.5, where performance of the
search-free methods for arbitrary arrays is compared.

5.2 BACKGROUND 

We assume $L$ narrowband far-field signal sources that impinge on an array
of $N$ omnidirectional sensors. The unknown DOAs of the signals are
$(\theta_1, \phi_1), \ldots, (\theta_L, \phi_L)$, where $\theta_l$ and $\phi_l$ are the azimuth and elevation angles
of the $l$th source, respectively. The $N \times 1$ array snapshot vector at time $k$ can be
modeled as

$$x(k) = A(\theta)\, s(k) + n(k) \qquad (5.1)$$

where $\theta = [\theta_1, \phi_1, \ldots, \theta_L, \phi_L]^T$ is the $2L \times 1$ vector of signal DOAs,

$$A(\theta) = [a(\theta_1, \phi_1), \ldots, a(\theta_L, \phi_L)] \qquad (5.2)$$

is the $N \times L$ signal direction matrix, $s(k)$ is the $L \times 1$ vector of signal waveforms,
$n(k)$ is the $N \times 1$ vector of sensor noise, and $(\cdot)^T$ is the transpose.

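The $N \times 1$ steering vector is given by

$$a(\theta, \phi) = \begin{bmatrix} \exp\left\{ j \frac{2\pi}{\lambda} (x_1 \sin\theta \sin\phi + y_1 \cos\theta \sin\phi + z_1 \cos\phi) \right\} \\ \vdots \\ \exp\left\{ j \frac{2\pi}{\lambda} (x_N \sin\theta \sin\phi + y_N \cos\theta \sin\phi + z_N \cos\phi) \right\} \end{bmatrix} \qquad (5.3)$$

where $\lambda$ is the signal wavelength, $j = \sqrt{-1}$, and $(x_i, y_i, z_i)$ are the coordinates of
the $i$th array sensor. In the sequel we assume, unless specified otherwise, that the
sensor array manifold is known.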
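As a small illustration of Equation (5.3), the steering vector of an arbitrary array can be
evaluated as follows. The function name and the example geometry are hypothetical, and
angles are assumed to be in radians.

```python
import cmath
import math


def steering_vector(sensors, azimuth, elevation, wavelength):
    """Evaluate Equation (5.3) for a list of (x, y, z) sensor coordinates."""
    k = 2.0 * math.pi / wavelength
    vector = []
    for x, y, z in sensors:
        phase = k * (x * math.sin(azimuth) * math.sin(elevation)
                     + y * math.cos(azimuth) * math.sin(elevation)
                     + z * math.cos(elevation))
        vector.append(cmath.exp(1j * phase))
    return vector


# Three sensors in the xy-plane, half a wavelength apart (assumed geometry).
sensors = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (0.0, 0.5, 0.0)]
a = steering_vector(sensors, math.pi / 2, math.pi / 2, 1.0)
```

For sources in the xy-plane the elevation is 90°, and the phase reduces to
$(2\pi/\lambda)(x_i \sin\theta + y_i \cos\theta)$.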

If the signal and noise vectors are zero mean, and if the noise is spatially
white and statistically independent of the source signals, then the $N \times N$ array
covariance matrix can be written as

$$R_x = E\{x(k)\, x^H(k)\} = A R_s A^H + \sigma^2 I_N \qquad (5.4)$$

where $R_s = E\{s(k)\, s^H(k)\}$ is the source covariance matrix, $\sigma^2$ is the sensor noise
variance, $I_N$ is the $N \times N$ identity matrix, $E\{\cdot\}$ denotes the statistical expectation,
and $(\cdot)^H$ denotes the Hermitian transpose.

The eigendecomposition of $R_x$ yields

$$R_x = \sum_{k=1}^{N} \lambda_k e_k e_k^H \qquad (5.5)$$

where $\lambda_k$ and $e_k$ ($k = 1, \ldots, N$) are the eigenvalues and corresponding eigenvectors
of $R_x$. Let the eigenvalues $\lambda_k$ be sorted in nonascending order.
Then the matrices

$$E_S = [e_1, \ldots, e_L], \quad E_N = [e_{L+1}, \ldots, e_N] \qquad (5.6)$$

contain the $L$ signal-subspace and $N - L$ noise-subspace eigenvectors,
respectively.

In practical situations, the exact array covariance matrix $R_x$ is unavailable,
so its sample estimate

$$\hat{R}_x = \frac{1}{K} \sum_{k=1}^{K} x(k)\, x^H(k) \qquad (5.7)$$

is used, where $K$ is the number of snapshots.

The eigendecomposition of the sample covariance matrix (5.7) yields

$$\hat{R}_x = \hat{E}_S \hat{\Lambda}_S \hat{E}_S^H + \hat{E}_N \hat{\Lambda}_N \hat{E}_N^H \qquad (5.8)$$

where the sample eigenvalues are again sorted in nonascending order ($\hat{\lambda}_1 \geq
\hat{\lambda}_2 \geq \ldots \geq \hat{\lambda}_N$), and the matrices $\hat{E}_S = [\hat{e}_1, \ldots, \hat{e}_L]$ and $\hat{E}_N = [\hat{e}_{L+1}, \ldots, \hat{e}_N]$
contain in their columns the signal- and noise-subspace eigenvectors of $\hat{R}_x$,
respectively. Correspondingly, the diagonal matrices $\hat{\Lambda}_S = \mathrm{diag}\{\hat{\lambda}_1, \ldots, \hat{\lambda}_L\}$ and
$\hat{\Lambda}_N = \mathrm{diag}\{\hat{\lambda}_{L+1}, \ldots, \hat{\lambda}_N\}$ are built from the signal- and noise-subspace
eigenvalues of $\hat{R}_x$, respectively.

The conventional MUSIC null-spectrum function (Bienvenu and Kopp, 1980;
Schmidt, 1979) can be expressed as

$$f(\theta, \phi) = a^H(\theta, \phi)\, \hat{E}_N \hat{E}_N^H\, a(\theta, \phi) = \| \hat{E}_N^H a(\theta, \phi) \|^2 \qquad (5.9)$$

where $\| \cdot \|$ is the vector-2 norm. The Spectral-MUSIC technique estimates the
signal DOAs from the minima of this function by searching over $\theta$ and $\phi$ using a
fine grid. The computational complexity of this spectral search step is typically
substantially higher than that of the eigendecomposition step because, as a rule,
$J \gg N$, where $J$ is the total number of spectral points. Note that for each spectral
point, the product of $\hat{E}_N^H$ and $a(\theta, \phi)$ (or, alternatively, of $\hat{E}_S^H$ and $a(\theta, \phi)$) has to
be computed. The statistical performance of MUSIC has been studied by Kaveh
and Barabell (1986) and Stoica and Nehorai (1989), where it has been shown
that at large values of $K$ and $N$, and for uncorrelated signals, this algorithm
asymptotically achieves the deterministic Cramér-Rao bound (CRB).
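The spectral search can be sketched in a few lines for the azimuth-only case. The code below
is a minimal illustration under assumptions: the array is a half-wavelength ULA on the
x-axis, the sources lie in the xy-plane, and all function names and scenario values are
hypothetical.

```python
import numpy as np


def sample_covariance(X):
    """Equation (5.7): X holds one snapshot per column."""
    return X @ X.conj().T / X.shape[1]


def music_null_spectrum(R_hat, L, steering, grid):
    """Equation (5.9) evaluated on a grid of candidate angles (radians)."""
    _, vecs = np.linalg.eigh(R_hat)               # eigenvalues in ascending order
    E_N = vecs[:, : R_hat.shape[0] - L]           # noise-subspace eigenvectors
    return np.array([np.linalg.norm(E_N.conj().T @ steering(t)) ** 2 for t in grid])


def deepest_minima(values, grid, L):
    """Grid points of the L smallest local minima, sorted by angle."""
    idx = [i for i in range(1, len(values) - 1)
           if values[i] < values[i - 1] and values[i] <= values[i + 1]]
    idx.sort(key=lambda i: values[i])
    return np.sort(grid[idx[:L]])


# Toy scenario (all values are assumptions): 8 sensors, 2 sources, 500 snapshots.
rng = np.random.default_rng(0)
N, L, K = 8, 2, 500
x = 0.5 * np.arange(N)                            # sensor positions in wavelengths
steer = lambda theta: np.exp(1j * 2.0 * np.pi * x * np.sin(theta))
true_doas = np.deg2rad([-10.0, 20.0])
A = np.column_stack([steer(t) for t in true_doas])
S = (rng.standard_normal((L, K)) + 1j * rng.standard_normal((L, K))) / np.sqrt(2.0)
noise = 0.1 * (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2.0)
X = A @ S + noise
grid = np.deg2rad(np.arange(-90.0, 90.0, 0.1))
spectrum = music_null_spectrum(sample_covariance(X), L, steer, grid)
est = deepest_minima(spectrum, grid, L)
print(np.rad2deg(est))
```

The grid here has $J = 1800$ points for $N = 8$ sensors, which illustrates why the search
step tends to dominate the cost.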


5.3 SEARCH-FREE METHODS FOR SPECIFIC ARRAY 
STRUCTURES 


Historically, search-free DOA estimation methods were originally proposed for
specific array geometries (Barabell, 1983; Paulraj et al., 1986). Search-free
techniques applicable to arbitrary array geometries were developed later, often based
on extensions of the earlier methods (Belloni et al., 2007; Doron and Doron,
1994a,b,c; Rübsamen and Gershman, 2008, 2009). In this section, we present
an overview of search-free DOA estimation methods that require specific array
geometries.

In the rest of this chapter, we will consider the one-dimensional DOA estimation
problem, assuming that the sensors and sources are located in the xy-plane
of the coordinate system. In this case, we have to estimate only the vector of the
source azimuth angles $\theta = [\theta_1, \ldots, \theta_L]^T$.


5.3.1 Root-MUSIC 

As mentioned before, the Root-MUSIC technique (Barabell, 1983) is applicable
to ULAs or to linear arrays with sensors that lie on a uniform grid. For the sake
of simplicity, we consider the ULA case in this section. Let the array sensors
be located on the x-axis of the coordinate system and centered with respect to
the origin. Then, with a slight abuse of notation, the steering vector (5.3) can be
written as

$$a(\theta) = \begin{bmatrix} \exp\left\{ -j \frac{N-1}{2} \frac{2\pi}{\lambda} d_x \sin\theta \right\} \\ \exp\left\{ -j \frac{N-3}{2} \frac{2\pi}{\lambda} d_x \sin\theta \right\} \\ \vdots \\ \exp\left\{ j \frac{N-1}{2} \frac{2\pi}{\lambda} d_x \sin\theta \right\} \end{bmatrix} = \begin{bmatrix} z^{-(N-1)/2} \\ z^{-(N-3)/2} \\ \vdots \\ z^{(N-1)/2} \end{bmatrix} = a(z) \qquad (5.10)$$

where $z = \exp\{ j (2\pi/\lambda) d_x \sin\theta \}$, and $d_x$ is the array inter-element spacing.
Using Equation (5.10), the MUSIC null-spectrum function can be written as

$$f(\theta) = a^H(\theta)\, \hat{E}_N \hat{E}_N^H\, a(\theta) = a^T(1/z)\, \hat{E}_N \hat{E}_N^H\, a(z) = f(z) \qquad (5.11)$$

The polynomial $f(z)$ has $2(N-1)$ roots, which form conjugate reciprocal pairs.
That is, if any $z_0$ is a root of $f(z)$, then $1/z_0^*$ is its root as well, where $(\cdot)^*$
denotes the complex conjugate. In the noise-free case, the polynomial $f(z)$ has $L$
pairs of roots $z_i = \exp\{ j (2\pi/\lambda) d_x \sin\theta_i \}$, $i = 1, \ldots, L$, and there are $2(N-L-1)$
additional "noise" roots. In the presence of noise, the root locations are distorted,
but the signal DOAs can be estimated from the roots of $f(z)$ that lie closest to the
unit circle. Because of the root conjugate reciprocity property, the roots inside
the unit circle contain all information about the signal DOAs. Thus, the
Root-MUSIC algorithm computes all roots of $f(z)$ and estimates the signal DOAs from
the $L$ largest-magnitude roots inside the unit circle.

Several modifications of Root-MUSIC have been proposed that offer
improved performance and/or reduced computational complexity (see Pesavento
et al., 2000 and Zoltowski et al., 1993 and references therein).

Clearly, Root-MUSIC is also applicable to any linear array with sensor
displacements that are integer multiples of a common baseline $d_x$. This enables its
application to sparse linear arrays.
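The rooting step can be sketched as follows. Multiplying $f(z)$ in Equation (5.11) by
$z^{N-1}$ gives an ordinary polynomial whose coefficients are the sums of the diagonals of
$\hat{E}_N \hat{E}_N^H$. The function name, the default spacing of half a wavelength, and the toy
scenario are assumptions made for illustration.

```python
import numpy as np


def root_music(R_hat, L, d_over_lambda=0.5):
    """Root-MUSIC DOA estimates (radians) for a ULA with spacing d_x."""
    N = R_hat.shape[0]
    _, vecs = np.linalg.eigh(R_hat)               # eigenvalues in ascending order
    E_N = vecs[:, : N - L]
    C = E_N @ E_N.conj().T
    # z^(N-1) f(z) = sum_k c_k z^(k+N-1), with c_k the sum of the k-th diagonal of C.
    coeffs = [np.trace(C, offset=k) for k in range(N - 1, -N, -1)]
    roots = np.roots(coeffs)
    inside = roots[np.abs(roots) < 1.0]
    # Keep the L largest-magnitude roots inside the unit circle.
    signal_roots = inside[np.argsort(np.abs(inside))[-L:]]
    return np.sort(np.arcsin(np.angle(signal_roots) / (2.0 * np.pi * d_over_lambda)))


# Toy scenario (all values are assumptions): centered 8-sensor ULA, 500 snapshots.
rng = np.random.default_rng(0)
N, L, K = 8, 2, 500
x = 0.5 * (np.arange(N) - (N - 1) / 2.0)          # positions in wavelengths
true_doas = np.deg2rad([-10.0, 20.0])
A = np.exp(1j * 2.0 * np.pi * np.outer(x, np.sin(true_doas)))
S = (rng.standard_normal((L, K)) + 1j * rng.standard_normal((L, K))) / np.sqrt(2.0)
noise = 0.1 * (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2.0)
X = A @ S + noise
doas = root_music(X @ X.conj().T / K, L)
print(np.rad2deg(doas))
```

No angular grid appears anywhere in this sketch. The cost is dominated by one
eigendecomposition and the rooting of a polynomial of degree $2(N-1)$.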


5.3.2 ESPRIT 


The ESPRIT algorithm was proposed by Paulraj et al. (1986) and further
developed and studied by Ottersten et al. (1991), Roy and Kailath (1989), and Roy et al.
(1986). It is applicable to sensor arrays that consist of two identical and identically
oriented subarrays, with one being a shifted "replica" of the other, where
the displacement vector between the subarrays has to be known, whereas the
geometry of each subarray may be unknown.

The steering vectors for the two subarrays can be written as

$$a_1(\theta) = J_1 a(\theta), \quad a_2(\theta) = J_2 a(\theta) \qquad (5.12)$$

where $J_1$ and $J_2$ are the selection matrices having in each row a single entry
equal to one and all the others zero. Without any loss of generality, we assume
that the two identical and identically oriented subarrays are displaced along the
x-axis of the coordinate system. Then the steering vectors for the two subarrays
are related as

$$a_1(\theta) = a_2(\theta)\, e^{j (2\pi/\lambda) d_x \sin\theta} \qquad (5.13)$$

where $d_x$ now denotes the displacement between the two subarrays. Similarly,
the two subarray steering matrices

$$A_1(\theta) = J_1 A(\theta), \quad A_2(\theta) = J_2 A(\theta) \qquad (5.14)$$

are related as

$$A_1(\theta) = A_2(\theta)\, Q(\theta) \qquad (5.15)$$

where

$$Q(\theta) \triangleq \mathrm{diag}\left\{ e^{j (2\pi/\lambda) d_x \sin\theta_1}, \ldots, e^{j (2\pi/\lambda) d_x \sin\theta_L} \right\} \qquad (5.16)$$

Note that the signal DOAs can be obtained from the diagonal elements of $Q(\theta)$.

For incoherent signal sources, the columns of $E_S$ and $A(\theta)$ span the same
subspace. Therefore,

$$E_S = A(\theta)\, T \qquad (5.17)$$

where $T$ is nonsingular. From Equations (5.15) and (5.17), it can be readily
shown that

$$E_{S1} = J_1 E_S = J_1 A T = A_1 T = A_2 Q T = E_{S2} T^{-1} Q T \qquad (5.18)$$

where $E_{S2} = J_2 E_S$. Clearly, the matrices $\Psi = T^{-1} Q T$ and $Q$ have the same
eigenvalues. Therefore, the ESPRIT algorithm proceeds in two steps. First, an
estimate of $\Psi$ is computed. Second, the signal DOAs are estimated from the
eigenvalues of the latter matrix.

One simple way to estimate $\Psi$ is to use the least-squares (LS) approach:

$$\hat{\Psi}_{LS} = \arg\min_{\Psi} \| \hat{E}_{S1} - \hat{E}_{S2} \Psi \|_F = \hat{E}_{S2}^{\dagger} \hat{E}_{S1} \qquad (5.19)$$

where $\| \cdot \|_F$ denotes the Frobenius norm, $\hat{E}_{S1} = J_1 \hat{E}_S$, $\hat{E}_{S2} = J_2 \hat{E}_S$, and $(\cdot)^{\dagger}$ stands
for the pseudo-inverse.
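For a ULA, the two subarrays can be taken as the last and the first $N - 1$ sensors, so
that the selection matrices reduce to row slicing. The sketch below implements the LS
estimate of Equation (5.19) under that assumption. The function name, the spacing of
half a wavelength, and the exact-covariance test scenario are hypothetical.

```python
import numpy as np


def ls_esprit(R_hat, L, d_over_lambda=0.5):
    """LS-ESPRIT DOA estimates (radians) for a ULA with overlapping subarrays."""
    _, vecs = np.linalg.eigh(R_hat)               # eigenvalues in ascending order
    E_S = vecs[:, -L:]                            # signal-subspace eigenvectors
    E_S1 = E_S[1:, :]                             # J_1: sensors 2, ..., N (shifted subarray)
    E_S2 = E_S[:-1, :]                            # J_2: sensors 1, ..., N-1
    Psi = np.linalg.pinv(E_S2) @ E_S1             # Equation (5.19)
    phases = np.angle(np.linalg.eigvals(Psi))     # (2 pi / lambda) d_x sin(theta_i)
    return np.sort(np.arcsin(phases / (2.0 * np.pi * d_over_lambda)))


# Toy scenario (assumed): exact covariance of two uncorrelated unit-power sources.
N, L = 8, 2
x = 0.5 * np.arange(N)                            # positions in wavelengths
true_doas = np.deg2rad([-10.0, 20.0])
A = np.exp(1j * 2.0 * np.pi * np.outer(x, np.sin(true_doas)))
R = A @ A.conj().T + 0.01 * np.eye(N)
doas = ls_esprit(R, L)
print(np.rad2deg(doas))
```

With the exact covariance matrix the signal subspace coincides with the column space of
$A(\theta)$, so the eigenvalues of the estimated $\Psi$ reproduce the diagonal of $Q(\theta)$ up to
rounding error.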

The minimization in Equation (15.19b can be written as 


A A 


min 

Esi — Esi VK 

e, = min 

Z i 



F ^.z. 



s.t. Esi +Zi =Es 2 ^ 


(5.20) 


Thus, the matrix $\hat{E}_{S1}$ is considered to contain errors, whereas
$\hat{E}_{S2}$ is interpreted to be error-free. In practice, both $\hat{E}_{S1}$
and $\hat{E}_{S2}$ contain errors, so improved DOA estimates can be obtained by
the total least-squares (TLS) approach (Ottersten et al., 1991; Roy and Kailath,
1989), which solves

$$\min_{\Psi, Z_1, Z_2} \left\| [Z_1, Z_2] \right\|_F \quad \text{s.t.} \quad \hat{E}_{S1} + Z_1 = (\hat{E}_{S2} + Z_2)\Psi \tag{5.21}$$


The solution to Equation (5.21) can be written as

$$\hat{\Psi} = -M_{12} M_{22}^{-1} \tag{5.22}$$

where the $L \times L$ matrices $M_{12}$ and $M_{22}$ follow from the eigenvalue
decomposition of the matrix



(5.23) 


In Equation (5.23), the eigenvalues are sorted in nonascending order, that is,
$\Lambda = \mathrm{diag}\{\lambda_1, \ldots, \lambda_{2L}\}$ with
$\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_{2L}$.

It has been shown that, for many cases of practical interest, the asymptotic
performance of TLS-ESPRIT is close to the CRB for fully calibrated sensor
arrays (Ottersten et al., 1991). For centro-symmetric sensor arrays (e.g., ULAs),
forward-backward (FB) averaging can be applied to achieve lower computational
complexity and improved DOA estimation performance (Haardt and Nossek,
1995).

The concept of ESPRIT can also be extended to sensor arrays that consist of
more than two identical and identically oriented subarrays (Sidiropoulos et al.,
2000; Swindlehurst et al., 1992).
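
The two ESPRIT variants described above can be illustrated with a minimal NumPy sketch. It is only a sketch under stated assumptions: the array is taken to be a ULA along the x-axis split into two maximally overlapping subarrays, the covariance matrix is the exact (infinite-sample) one, and the function names `ula_steering` and `esprit` are hypothetical. The stacking order of the matrix whose eigendecomposition is used in the TLS branch is also an assumption, chosen so that the result agrees with Equations (5.21) and (5.22).

```python
import numpy as np

def ula_steering(theta_deg, n_sensors, d_over_lambda):
    """Steering matrix of a ULA along the x-axis (one column per DOA)."""
    phase = 2 * np.pi * d_over_lambda * np.sin(np.deg2rad(np.atleast_1d(theta_deg)))
    return np.exp(1j * np.outer(np.arange(n_sensors), phase))

def esprit(R, n_sources, d_over_lambda, method="ls"):
    """LS- or TLS-ESPRIT for a ULA with two maximally overlapping subarrays."""
    _, vecs = np.linalg.eigh(R)                 # eigenvalues in ascending order
    Es = vecs[:, -n_sources:]                   # signal subspace
    Es1, Es2 = Es[1:, :], Es[:-1, :]            # J1: last N-1 rows, J2: first N-1 rows
    if method == "ls":
        Psi = np.linalg.pinv(Es2) @ Es1         # Eq. (5.19)
    else:
        C = np.hstack([Es2, Es1])               # assumed stacking order for the TLS step
        lam, M = np.linalg.eigh(C.conj().T @ C)
        M = M[:, np.argsort(lam)[::-1]]         # nonascending eigenvalue order
        L = n_sources
        Psi = -M[:L, L:] @ np.linalg.inv(M[L:, L:])   # Eq. (5.22)
    phases = np.angle(np.linalg.eigvals(Psi))
    return np.sort(np.rad2deg(np.arcsin(phases / (2 * np.pi * d_over_lambda))))

true_doas = np.array([10.0, 25.0])
A = ula_steering(true_doas, n_sensors=8, d_over_lambda=0.5)
R = A @ A.conj().T + 0.01 * np.eye(8)           # exact covariance, unit-power sources
doas_ls = esprit(R, 2, 0.5, method="ls")
doas_tls = esprit(R, 2, 0.5, method="tls")
```

With an exact covariance matrix both variants recover the DOAs up to rounding; differences between LS and TLS only appear once the subspace is estimated from a finite number of snapshots.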
5.3.3 UCA Root-MUSIC

UCAs are often preferred to ULAs because of ULA direction-of-arrival ambiguities
and strongly varying beamshapes. Although for UCAs neither Root-MUSIC
nor ESPRIT can be directly applied, search-free DOA estimation algorithms can
be obtained using the phase mode excitation theory (Mathews and Zoltowski,
1994).

Let the nth UCA sensor be located at

$$x_n = r \sin\left( 2\pi \frac{n-1}{N} \right), \quad y_n = r \cos\left( 2\pi \frac{n-1}{N} \right), \quad z_n = 0 \tag{5.24}$$

where $r$ denotes the array radius. Substituting Equation (5.24) into Equation (5.3)
yields

$$a(\theta) = \left[ e^{j\frac{2\pi}{\lambda} r \cos(\theta)},\; e^{j\frac{2\pi}{\lambda} r \cos\left(2\pi\frac{1}{N} - \theta\right)},\; \ldots,\; e^{j\frac{2\pi}{\lambda} r \cos\left(2\pi\frac{N-1}{N} - \theta\right)} \right]^T \tag{5.25}$$


The excitation of the mth phase mode

$$w_m = \frac{1}{N} \left[ 1,\; e^{-j2\pi\frac{m}{N}},\; \ldots,\; e^{-j2\pi\frac{m(N-1)}{N}} \right]^T \tag{5.26}$$

by a signal with steering vector $a(\theta)$ is given by

$$w_m^H a(\theta) = \frac{1}{N} \sum_{n=0}^{N-1} e^{j2\pi\frac{mn}{N}}\, e^{j\frac{2\pi}{\lambda} r \cos\left(2\pi\frac{n}{N} - \theta\right)} \approx j^m J_m\!\left( \frac{2\pi}{\lambda} r \right) e^{jm\theta} \tag{5.27}$$

if $|m| < N/2$. In Equation (5.27), $J_m(\cdot)$ denotes the Bessel function of the
first kind of order $m$, and the approximation becomes arbitrarily tight for large
$N$. It is a common rule of thumb that $J_m(y) \approx 0$ if $|m| > y$; that is,
essentially only phase modes with $|m| < 2\pi r/\lambda$ are excited. For
$m = -M+1, \ldots, M-1$, where $M < \min\{N/2,\, 2\pi r/\lambda\}$, we obtain from
Equation (5.27) that

$$W^H a(\theta) \approx v(\theta) \tag{5.28}$$

where the lth row of the $(2M-1) \times N$ matrix $W^H$ is given by

$$\left( W^H \right)_{l,:} = \frac{w_{-M+l}^H}{j^{-M+l}\, J_{-M+l}(2\pi r/\lambda)} \tag{5.29}$$

and the $(2M-1) \times 1$ Vandermonde vector $v(\theta)$ is

$$v(\theta) = \left[ e^{j(-M+1)\theta},\; e^{j(-M+2)\theta},\; \ldots,\; e^{j(M-1)\theta} \right]^T = \left[ z^{-M+1},\; z^{-M+2},\; \ldots,\; z^{M-1} \right]^T \triangleq v(z) \tag{5.30}$$

Note that in Equation (5.30) we use the definition $z = \exp\{j\theta\}$, which is
different from that used in Section 5.3.1. The Vandermonde structure of
$v(\theta)$ enables direct application of Root-MUSIC, ESPRIT, and FB averaging
techniques. In what follows, we focus on UCA Root-MUSIC.
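
The phase-mode transformation can be checked numerically. The following NumPy sketch builds the rows of $W^H$ for a UCA and compares $W^H a(\theta)$ with the Vandermonde vector $v(\theta)$. The array size, radius, and test angle are arbitrary illustrative choices, the helper names are hypothetical, and the Bessel function is evaluated from its integral representation so that only NumPy is needed.

```python
import numpy as np

def bessel_j(order, x, n_grid=4096):
    """J_order(x) for integer order, from the periodic Bessel integral."""
    tau = 2 * np.pi * np.arange(n_grid) / n_grid
    return np.mean(np.cos(order * tau - x * np.sin(tau)))

def uca_steering(theta, n_sensors, r_over_lambda):
    """UCA steering vector as in Eq. (5.25); theta in radians."""
    n = np.arange(n_sensors)
    return np.exp(1j * 2 * np.pi * r_over_lambda
                  * np.cos(2 * np.pi * n / n_sensors - theta))

def phase_mode_matrix(n_sensors, r_over_lambda, M):
    """Rows of W^H as in Eq. (5.29), for modes m = -M+1, ..., M-1."""
    n = np.arange(n_sensors)
    rows = []
    for m in range(-M + 1, M):
        w_m = np.exp(-1j * 2 * np.pi * m * n / n_sensors) / n_sensors  # Eq. (5.26)
        scale = (1j ** m) * bessel_j(m, 2 * np.pi * r_over_lambda)
        rows.append(w_m.conj() / scale)
    return np.array(rows)

N, r_over_lambda, M = 25, 0.8, 5        # M < min(N/2, 2*pi*r/lambda), roughly 5.03
theta = np.deg2rad(37.0)
WH = phase_mode_matrix(N, r_over_lambda, M)
v = np.exp(1j * np.arange(-M + 1, M) * theta)            # Eq. (5.30)
max_error = np.max(np.abs(WH @ uca_steering(theta, N, r_over_lambda) - v))
```

For this choice of N the residual is negligible; reducing N toward 2M makes the neglected higher-order Bessel terms visible in `max_error`.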
From Equations (5.4) and (5.28), it follows that

$$W^H R_x W \approx V(\boldsymbol{\theta}) R_s V^H(\boldsymbol{\theta}) + \sigma^2 W^H W \tag{5.31}$$

where $V(\boldsymbol{\theta}) = [v(\theta_1), \ldots, v(\theta_L)]$. In the finite sample
case, after noise prewhitening, the eigendecomposition of the following matrix
can be used:

$$(W^H W)^{-1/2} W^H R_x W (W^H W)^{-1/2} = E_{C,S} \Lambda_{C,S} E_{C,S}^H + E_{C,N} \Lambda_{C,N} E_{C,N}^H \tag{5.32}$$

where the columns of $E_{C,N}$ span the noise subspace. The UCA Root-MUSIC
polynomial (Mathews and Zoltowski, 1994) can then be defined as

$$f_C(z) = v^T(1/z)\, (W^H W)^{-1/2} E_{C,N} E_{C,N}^H (W^H W)^{-1/2}\, v(z) \tag{5.33}$$

As in Root-MUSIC, the DOAs can be estimated from the largest-magnitude roots
of $f_C(z)$ lying inside the unit circle.

It is important to note that the approximation in Equation (5.28) leads to
biased DOA estimates. The bias can be reduced by increasing the number of
sensors $N$ (Mathews and Zoltowski, 1994).


5.3.4 Root-RARE 


The search-free Root-RARE estimator (Pesavento et al., 2002) is a generalization
of the Root-MUSIC algorithm for partly calibrated sensor arrays. This
algorithm is applicable to sensor arrays that consist of several linear, identically
oriented, calibrated subarrays with inter-element spacings of each subarray being
integer multiples of a common baseline. Therefore, RARE is applicable to an
array composed of several colinear Root-MUSIC-type arrays with inter-subarray
displacements that can be unknown. Figure 5.1 depicts an example array geometry
where the crosses represent the sensor locations and the lines indicate the
apertures of calibrated subarrays.

FIGURE 5.1 Example array geometry for Root-RARE.

Let us consider a sensor array in the xy-plane of the coordinate system that
consists of $M$ linear identically oriented subarrays. Without loss of generality,
we assume that all of the subarrays are parallel to the x-axis. The mth subarray
consists of $N_m$ sensors, and the unknown xy-coordinates of its first sensor
are $(\alpha_{x,m}, \alpha_{y,m})$. Thus, the location of the rth sensor of the mth
subarray can be expressed as $(\alpha_{x,m} + K_{m,r} d_x,\; \alpha_{y,m})$, where
$K_{m,r}$ is the integer multiple of the common baseline $d_x$. The $N_m \times 1$
steering vector of the mth subarray is


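$$a_m(\theta, \alpha_{x,m}, \alpha_{y,m}) = b_m(\theta)\, e^{j\frac{2\pi}{\lambda}(\alpha_{x,m} \sin\theta + \alpha_{y,m} \cos\theta)} \tag{5.34}$$

where

$$b_m(\theta) = \begin{bmatrix} 1 \\ e^{jK_{m,2}\frac{2\pi}{\lambda} d_x \sin\theta} \\ \vdots \\ e^{jK_{m,N_m}\frac{2\pi}{\lambda} d_x \sin\theta} \end{bmatrix} = \begin{bmatrix} 1 \\ z^{K_{m,2}} \\ \vdots \\ z^{K_{m,N_m}} \end{bmatrix} \triangleq b_m(z) \tag{5.35}$$

and where we return to the notation $z = \exp\{j(2\pi/\lambda) d_x \sin\theta\}$.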

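Equation (5.34) leads to the following expression for the steering vector of
the whole array:

$$a(\theta, \boldsymbol{\alpha}) = B(\theta)\, h(\theta, \boldsymbol{\alpha}) \tag{5.36}$$

where $\boldsymbol{\alpha} = [\alpha_{x,1}, \alpha_{y,1}, \ldots, \alpha_{x,M}, \alpha_{y,M}]^T$
stacks all the unknown subarray locations,

$$h(\theta, \boldsymbol{\alpha}) = \begin{bmatrix} e^{j\frac{2\pi}{\lambda}(\alpha_{x,1} \sin\theta + \alpha_{y,1} \cos\theta)} \\ \vdots \\ e^{j\frac{2\pi}{\lambda}(\alpha_{x,M} \sin\theta + \alpha_{y,M} \cos\theta)} \end{bmatrix} \tag{5.37}$$

and the $N \times M$ matrix $B(\theta)$ is defined as

$$B(\theta) \triangleq \begin{bmatrix} b_1(\theta) & 0_{N_1 \times 1} & \cdots & 0_{N_1 \times 1} \\ 0_{N_2 \times 1} & b_2(\theta) & \cdots & 0_{N_2 \times 1} \\ \vdots & \vdots & \ddots & \vdots \\ 0_{N_M \times 1} & 0_{N_M \times 1} & \cdots & b_M(\theta) \end{bmatrix} = \begin{bmatrix} b_1(z) & 0_{N_1 \times 1} & \cdots & 0_{N_1 \times 1} \\ 0_{N_2 \times 1} & b_2(z) & \cdots & 0_{N_2 \times 1} \\ \vdots & \vdots & \ddots & \vdots \\ 0_{N_M \times 1} & 0_{N_M \times 1} & \cdots & b_M(z) \end{bmatrix} \triangleq B(z) \tag{5.38}$$

where $0_{k \times l}$ denotes a $k \times l$ matrix of zeros. The steering vectors of
the true signal DOAs are orthogonal to the noise subspace, so it follows from
Equation (5.36) that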


$$\left\| E_N^H a(\theta_l, \boldsymbol{\alpha}) \right\|^2 = h^H(\theta_l, \boldsymbol{\alpha})\, B^H(\theta_l) E_N E_N^H B(\theta_l)\, h(\theta_l, \boldsymbol{\alpha}) = 0, \quad l = 1, \ldots, L. \tag{5.39}$$

The RARE algorithm exploits the fact that the matrix

$$Z(\theta) \triangleq B^H(\theta) E_N E_N^H B(\theta) \tag{5.40}$$

does not depend on $\boldsymbol{\alpha}$ and that, according to Equation (5.39), this
matrix is rank deficient for $\theta = \theta_l$, $l = 1, \ldots, L$. The RARE null
spectrum (Pesavento et al., 2002) can be defined as

$$f_{\mathrm{RARE}}(\theta) = \det\left( B^H(\theta) E_N E_N^H B(\theta) \right) \tag{5.41}$$

so that the source DOAs can be estimated from the minima of this function.
Similar to Root-MUSIC, the spectral search can be replaced by rooting the
polynomial

$$f_{\mathrm{RARE}}(z) = \det\left( B^T(1/z) E_N E_N^H B(z) \right) \tag{5.42}$$

It has been shown by Pesavento et al. (2002) that, similar to the Root-MUSIC
case, the roots of $f_{\mathrm{RARE}}(z)$ form conjugate reciprocal pairs.
Consequently, the source DOAs can be estimated from the largest-magnitude roots
lying inside the unit circle.
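
As an illustration of the rank-reduction idea, the sketch below evaluates the RARE null spectrum of Equation (5.41) for a hypothetical partly calibrated array with two subarrays. The subarray layout, the displacement values, the source directions, and the function names are assumptions made for this example; the estimator only uses the integer multiples $K_{m,r}$ and the baseline, never the subarray displacements.

```python
import numpy as np

d = 0.5                                           # baseline d_x in wavelengths
K = [np.array([0, 1, 2]), np.array([0, 1, 3])]    # integer multiples K_{m,r}
offsets = np.array([[0.0, 0.0], [2.3, 0.7]])      # (alpha_x, alpha_y), unknown to RARE

def true_steering(theta):
    """Full-array steering vector, built from the actual sensor positions."""
    parts = []
    for Km, (ax, ay) in zip(K, offsets):
        x = ax + Km * d
        parts.append(np.exp(1j * 2 * np.pi * (x * np.sin(theta) + ay * np.cos(theta))))
    return np.concatenate(parts)

def B_matrix(theta):
    """Block-diagonal B(theta) of Eq. (5.38); needs only calibrated quantities."""
    B = np.zeros((sum(len(Km) for Km in K), len(K)), dtype=complex)
    row = 0
    for m, Km in enumerate(K):
        B[row:row + len(Km), m] = np.exp(1j * 2 * np.pi * Km * d * np.sin(theta))
        row += len(Km)
    return B

def rare_spectrum(theta, En):
    """RARE null spectrum of Eq. (5.41)."""
    B = B_matrix(theta)
    return np.linalg.det(B.conj().T @ En @ En.conj().T @ B).real

doas = np.deg2rad([-20.0, 35.0])
A = np.column_stack([true_steering(t) for t in doas])
R = A @ A.conj().T + 0.1 * np.eye(A.shape[0])     # exact covariance
_, vecs = np.linalg.eigh(R)
En = vecs[:, :A.shape[0] - len(doas)]             # noise subspace

grid = np.deg2rad(np.arange(-90.0, 90.5, 0.5))
spectrum = np.array([rare_spectrum(t, En) for t in grid])
nulls = [rare_spectrum(t, En) for t in doas]
```

The values in `nulls` vanish up to rounding even though the displacement between the subarrays was never given to `rare_spectrum`.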

A modification of the RARE technique that is robust to subarray misorientations
has been proposed by Abd Elkader et al. (2006). Several other search-free
approaches have been developed for subarray-based arrays by Swindlehurst et al.
(2007).


5.3.5 Generalized ESPRIT 

The generalized ESPRIT algorithm developed by Gao and Gershman (2005) is an
extension of the standard ESPRIT algorithm for a broader class of array geometries.
In contrast to ESPRIT (which requires identical displacements between
any pair of corresponding sensors of the first and second subarray), generalized
ESPRIT accounts for different collinear displacement vectors the lengths
of which must be integer multiples of a common baseline. Figure 5.2 shows an
example array geometry for the generalized ESPRIT algorithm.

FIGURE 5.2 Example array geometry for generalized ESPRIT.

As in the conventional ESPRIT case, let all the array sensors lie in the xy-plane,
and let the displacements be parallel to the x-axis of the coordinate system.
Using the same notations as in Section 5.3.2, the steering matrices of the two
subarrays can be written as

$$A_1(\boldsymbol{\theta}) = [a(\theta_1), \ldots, a(\theta_L)], \quad A_2(\boldsymbol{\theta}) = [\Gamma(\theta_1) a(\theta_1), \ldots, \Gamma(\theta_L) a(\theta_L)] \tag{5.43}$$

where

$$\Gamma(\theta) \triangleq \mathrm{diag}\left\{ e^{j\frac{2\pi}{\lambda} K_1 d_x \sin\theta}, \ldots, e^{j\frac{2\pi}{\lambda} K_M d_x \sin\theta} \right\} \tag{5.44}$$

$M$ is the number of sensor pairs, and $K_m$, $m = 1, \ldots, M$, are the integer
multiples of the baseline $d_x$ such that $K_m d_x$ is the displacement between the
two sensors of the mth pair. Thus, we obtain

$$E_{S2} - \Gamma(\theta) E_{S1} = A_2 T - \Gamma(\theta) A_1 T = Q(\theta)\, T \tag{5.45}$$

where

$$Q(\theta) \triangleq \left[ (\Gamma(\theta_1) - \Gamma(\theta)) a(\theta_1), \ldots, (\Gamma(\theta_L) - \Gamma(\theta)) a(\theta_L) \right] \tag{5.46}$$

$E_{S1} = J_1 E_S$, and $E_{S2} = J_2 E_S$.

The generalized ESPRIT algorithm is based on the observation that $Q(\theta)$
becomes rank deficient for any $\theta = \theta_l$, $l = 1, \ldots, L$, because its
lth column then vanishes. Therefore, its null spectrum can be defined as

$$f_{\mathrm{GESPRIT}}(\theta) = \det\left\{ (E_{S2} - \Gamma(\theta) E_{S1})^H (E_{S2} - \Gamma(\theta) E_{S1}) \right\} \tag{5.47}$$

By defining

$$\Gamma(z) = \mathrm{diag}\{ z^{K_1}, \ldots, z^{K_M} \} \tag{5.48}$$

with $z = \exp\{j(2\pi/\lambda) d_x \sin\theta\}$, the following polynomial can be
obtained from Equation (5.47):

$$f_{\mathrm{GESPRIT}}(z) = \det\left\{ \left( E_{S2}^H - E_{S1}^H \Gamma(1/z) \right) \left( E_{S2} - \Gamma(z) E_{S1} \right) \right\} \tag{5.49}$$

The source DOAs can be estimated from the roots of $f_{\mathrm{GESPRIT}}(z)$ in the
same way as for the standard Root-MUSIC technique.

Note that our definition of the generalized ESPRIT null-spectrum function
is somewhat different from that used in Gao and Gershman (2005). The null
spectrum in Equation (5.47) is real-valued and non-negative, and the roots of
$f_{\mathrm{GESPRIT}}(z)$ appear in conjugate reciprocal pairs.


5.4 SEARCH-FREE METHODS FOR ARBITRARY ARRAYS 

In this section, we provide an overview of search-free DOA estimation methods 
that can be applied to arrays of arbitrary geometry. 


5.4.1 Interpolated Root-MUSIC 


The essence of the array interpolation technique (Friedlander, 1993; Friedlander
and Weiss, 1992, 1993; Weiss et al., 1995) is to approximate the steering vector
of the actual sensor array as

$$a(\theta) \approx G_1 g_1(\theta) \tag{5.50}$$

where $g_1(\theta)$ is the $M_1 \times 1$ steering vector of a virtual ULA, $M_1$ is
the number of virtual sensors, and $G_1$ is the $N \times M_1$ array interpolation
matrix designed to minimize the interpolation error. For example, if the virtual
sensors are aligned along the x-axis of the coordinate system and centered with
respect to its origin, the "virtual" steering vector is described by Equation (5.10),
with $N$ replaced by $M_1$ and with $d_x$ being the inter-element spacing of the
virtual ULA.


The MUSIC null-spectrum function (5.9) can thus be approximated as

$$f(\theta) \approx g_1^H(\theta)\, G_1^H E_N E_N^H G_1\, g_1(\theta) \triangleq f_1(\theta) \tag{5.51}$$

which can be expressed in terms of $z = \exp\{j(2\pi/\lambda) d_x \sin\theta\}$ as

$$f_1(z) = g_1^T(1/z)\, G_1^H E_N E_N^H G_1\, g_1(z) \tag{5.52}$$

This is a polynomial of degree $2(M_1 - 1)$ the roots of which appear in conjugate
reciprocal pairs. As with Root-MUSIC, the signal DOAs can be estimated from
the largest-magnitude roots located inside the unit circle.

Because the approximation in Equation (5.50) is typically inaccurate for the
whole array angular field of view, angular sectors have to be defined and such an
approximation has to be used separately for each sector. The number of sectors, 
the number of virtual sensors, and their locations are the design parameters of 
the interpolated Root-MUSIC method. 

5.4.2 Manifold Separation 

Another elegant Root-MUSIC method for arbitrary arrays was proposed in
Doron and Doron (1994a,b) and further analyzed in Belloni et al. (2007). In
Doron and Doron (1994a), it was shown that for any arbitrary array the steering
vector can be approximated as

$$a(\theta) \approx G_2 g_2(\theta) \tag{5.53}$$

where $G_2$ is an $N \times M_2$ matrix that depends only on the array parameters and

$$g_2(\theta) = \left[ e^{-j\frac{M_2 - 1}{2}\theta}, \ldots, e^{j\frac{M_2 - 1}{2}\theta} \right]^T$$

is an $M_2 \times 1$ Vandermonde vector that depends only on $\theta$ and $M_2$. The
elements of $g_2(\theta)$ form a subset of the Fourier basis, and the parameter
$M_2$ characterizes the accuracy of the approximation in Equation (5.53). Given the
completeness of the Fourier basis, Equation (5.53) becomes exact when
$M_2 \to \infty$, and the accuracy of the approximation improves when increasing
$M_2$. Thus, sector-wise processing can be avoided.


Using the results of Doron and Doron (1994a), it was proposed by Belloni
et al. (2007) and Doron and Doron (1994b) to use Equation (5.53) with some
finite $M_2$ to approximate the MUSIC null-spectrum function as

$$f(\theta) \approx g_2^H(\theta)\, G_2^H E_N E_N^H G_2\, g_2(\theta) \triangleq f_2(\theta) = g_2^T(1/z)\, G_2^H E_N E_N^H G_2\, g_2(z) \triangleq f_2(z) \tag{5.54}$$

where, in contrast to Equation (5.52), $z = \exp\{j\theta\}$ and the degree of the
polynomial is $2M_2 - 2$. It was suggested in Doron and Doron (1994b) that the
signal DOAs be obtained from the largest-magnitude roots of $f_2(z)$ located
inside the unit circle, in a way similar to that used in the standard Root-MUSIC
algorithm.

Several methods have been proposed to compute the matrix $G_2$, including
the least-squares method and another technique that determines each element of
$G_2$ via the inverse discrete Fourier transform (IDFT) of different components of
the steering vector $a(\theta)$ taken at different angles. The LS technique is
optimal in the sense that it minimizes the manifold approximation error.

The parameter $M_2$ should be large enough to obtain an acceptable DOA
estimation performance. The following rule of thumb was suggested for choosing
$M_2$ (Doron and Doron, 1994a):

$$M_2 \approx 8\pi r/\lambda \tag{5.55}$$

where $r$ is the largest distance between the array sensors and the origin of the
coordinate system.
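
A least-squares fit of $G_2$ can be sketched in a few lines of NumPy. The sensor positions below are hypothetical, the steering-vector model (planar array, phase proportional to $x \sin\theta + y \cos\theta$) is assumed to match the one used in this chapter, and the Fourier vector is taken with symmetric integer exponents, which requires an odd $M_2$. The sketch compares the fit error for the rule-of-thumb value of $M_2$ with that for a much larger value.

```python
import numpy as np

# Hypothetical planar array: sensor (x, y) positions in wavelengths.
positions = np.array([[0.0, 0.0], [0.6, 0.1], [-0.3, 0.7],
                      [0.2, -0.8], [-0.9, -0.2], [0.0, 1.0]])

def steering(theta):
    """Steering vectors for angles theta (radians); shape N x len(theta)."""
    theta = np.atleast_1d(theta)
    phase = (np.outer(positions[:, 0], np.sin(theta))
             + np.outer(positions[:, 1], np.cos(theta)))
    return np.exp(1j * 2 * np.pi * phase)

def fourier_basis(theta, M2):
    """Vandermonde vectors with exponents -(M2-1)/2, ..., (M2-1)/2 (M2 odd)."""
    orders = np.arange(M2) - (M2 - 1) // 2
    return np.exp(1j * np.outer(orders, np.atleast_1d(theta)))

def ms_matrix(M2, n_samples=256):
    """Least-squares G_2 in a(theta) ~ G_2 g_2(theta) on a uniform angle grid."""
    grid = -np.pi + 2 * np.pi * np.arange(n_samples) / n_samples
    return steering(grid) @ np.linalg.pinv(fourier_basis(grid, M2))

def max_fit_error(M2, test_angles):
    G2 = ms_matrix(M2)
    return np.max(np.abs(steering(test_angles) - G2 @ fourier_basis(test_angles, M2)))

r_max = np.max(np.linalg.norm(positions, axis=1))   # largest distance from the origin
M2_rule = int(np.ceil(8 * np.pi * r_max))           # Eq. (5.55), r in wavelengths
M2_rule += 1 - M2_rule % 2                          # round up to an odd length
test_angles = np.deg2rad(np.linspace(-179.3, 179.3, 97))
err_rule = max_fit_error(M2_rule, test_angles)
err_large = max_fit_error(61, test_angles)
```

On a uniform grid the sampled Fourier vectors are orthogonal, so the pseudo-inverse here reduces to a scaled DFT of each sensor response, which ties the LS fit to the IDFT-based technique mentioned above.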


5.4.3 Fourier Domain Root-MUSIC 


The FD Root-MUSIC technique of Rübsamen and Gershman (2008, 2009) uses
a different approach to obtain a polynomial that approximates the MUSIC
null-spectrum function. It exploits the fact that the MUSIC null-spectrum function
(5.9) is periodic in $\theta$ with the period $2\pi$. Therefore, the latter function
can be expressed using its Fourier series expansion as

$$f(\theta) = \sum_{m=-\infty}^{\infty} F_m e^{jm\theta} \tag{5.56}$$

where the Fourier coefficients are given by

$$F_m = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(\theta)\, e^{-jm\theta}\, d\theta \tag{5.57}$$

Truncating the Fourier series in Equation (5.56) to $2M_3 - 1$ terms, the function
$f(\theta)$ can be approximated as

$$f(\theta) \approx \sum_{m=-(M_3-1)}^{M_3-1} F_m e^{jm\theta} = \sum_{m=-(M_3-1)}^{M_3-1} F_m z^m \triangleq \tilde{f}(z) \tag{5.58}$$

where the notation $z = \exp\{j\theta\}$ is used. It can be seen that, if
$M_2 = M_3$, the two polynomials in Equations (5.54) and (5.58) differ only in
their polynomial coefficients.

It was shown in Rübsamen and Gershman (2009) that the polynomial coefficients
(5.57) provide a better approximation of the original MUSIC null spectrum
than does the MS technique. Therefore, it may be expected that FD Root-MUSIC
shows improved DOA estimation performance as compared to the MS technique.
This conjecture was validated in Rübsamen and Gershman (2009) by computer
simulations.

The Fourier series coefficients $F_m$ ($m = -(M_3 - 1), \ldots, M_3 - 1$) can be
approximately obtained using the discrete Fourier transform (DFT) as

$$F_m \approx \frac{1}{2\pi} \sum_{l=-(M_3-1)}^{M_3-1} f(l\Delta\theta)\, e^{-jml\Delta\theta}\, \Delta\theta \triangleq \hat{F}_m \tag{5.59}$$

where $\Delta\theta = 2\pi/(2M_3 - 1)$. With such a DFT approximation of
$\tilde{f}(z)$, the final expression for the FD Root-MUSIC polynomial can be
written as

$$\tilde{f}(z) \approx \sum_{m=-(M_3-1)}^{M_3-1} \hat{F}_m z^m \triangleq f_3(z) = \sum_{m=-(M_3-1)}^{M_3-1} \hat{F}_m e^{jm\theta} \triangleq f_3(\theta) \tag{5.60}$$

The DFT coefficients $\hat{F}_m$ differ from the Fourier series coefficients $F_m$
because of aliasing introduced by sampling the MUSIC null-spectrum function
$f(\theta)$. The aliasing effect can be reduced by increasing $M_3$, but this leads
to a higher computational complexity, a detailed discussion of which is presented
in Rübsamen and Gershman (2009).

The functions $f_1(\theta)$ and $f_2(\theta)$ are non-negative by definition, which
is not the case for $f_3(\theta)$. That is, $f_3(\theta)$ may have some values that
are slightly below zero in some of its minima. Because of the two sign changes of
$f_3(\theta)$, whenever $f_3(\theta)$ takes a negative value we obtain two roots
that lie exactly on the unit circle very close to each other. Taking into account
that $\hat{F}_m^* = \hat{F}_{-m}$, it can be shown again that the roots of
Equation (5.60) satisfy the conjugate reciprocity property. That is, if $z_0$ is a
root, then $1/z_0^*$ also nullifies $f_3(z)$. However, this property becomes
trivial for roots that lie exactly on the unit circle and therefore do not appear in
conjugate reciprocal pairs.

Based on this observation, a procedure different from Root-MUSIC was
proposed in Rübsamen and Gershman (2009) to estimate the signal DOAs from
the $2(M_3 - 1)$ roots of the polynomial $f_3(z)$. The signal roots of $f_3(z)$ can
be divided into two groups. The first group contains the roots that appear in pairs,
lying exactly at the unit circle very close to each other. As discussed before,
these root pairs correspond to negative minimum values of $f_3(\theta)$, and there
are no conjugate reciprocal roots caused by roots of this type. The second group
contains the roots that do not belong to the unit circle. These appear in conjugate
reciprocal pairs and are the same as the roots of the standard Root-MUSIC
polynomial.
A simple procedure to estimate the source DOAs from the roots of $f_3(z)$ can
therefore be summarized as follows:

Step 1. Take the root closest to the unit circle and determine if it belongs to
the first or second group by checking whether its conjugate reciprocal value is
another root.

Step 2. If this root belongs to the first group, estimate the source DOA from
the average of this root and its closest neighbor, and drop both. Go to step 4.

Step 3. If this root belongs to the second group, use it to estimate the source
DOA and then drop both this root and its corresponding conjugate reciprocal root.

Step 4. If fewer than L DOAs have been estimated, return to step 1.
Otherwise, stop.

It is worth noting that the $\hat{F}_m$ can be obtained in a computationally
efficient way using the fast Fourier transform (FFT) algorithm.
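
The steps above can be sketched in NumPy as follows. This is a minimal illustration rather than the authors' implementation: the array geometry and source directions are hypothetical, the covariance matrix is exact, the DFT coefficients are computed by a direct sum instead of an FFT, and the group test of step 1 is replaced by a simple tolerance on the distance of the root from the unit circle (an assumption of this sketch).

```python
import numpy as np

# Hypothetical planar array: sensor (x, y) positions in wavelengths.
positions = np.array([[0.0, 0.0], [0.45, 0.1], [-0.2, 0.4],
                      [0.1, -0.5], [-0.5, -0.15], [0.3, 0.35]])

def steering(theta):
    theta = np.atleast_1d(theta)
    phase = (np.outer(positions[:, 0], np.sin(theta))
             + np.outer(positions[:, 1], np.cos(theta)))
    return np.exp(1j * 2 * np.pi * phase)

def fd_root_music(En, n_sources, M3):
    # Sample the MUSIC null spectrum at 2*M3 - 1 uniformly spaced angles.
    l = np.arange(-(M3 - 1), M3)
    d_theta = 2 * np.pi / (2 * M3 - 1)
    f = np.sum(np.abs(En.conj().T @ steering(l * d_theta)) ** 2, axis=0)
    # DFT coefficients of Eq. (5.59) for m = -(M3-1), ..., M3-1.
    F = np.exp(-1j * d_theta * np.outer(l, l)) @ f / (2 * M3 - 1)
    # z**(M3-1) * f_3(z) is an ordinary polynomial; np.roots wants highest power first.
    roots = list(np.roots(F[::-1]))
    doas = []
    while len(doas) < n_sources and len(roots) >= 2:
        z0 = roots.pop(int(np.argmin(np.abs(np.abs(roots) - 1))))   # step 1
        on_circle = abs(abs(z0) - 1) < 1e-6      # tolerance-based group test
        target = z0 if on_circle else 1 / np.conj(z0)
        partner = roots.pop(int(np.argmin(np.abs(np.array(roots) - target))))
        if on_circle:                            # step 2: average the close pair
            doas.append(np.angle(z0 * np.exp(1j * np.angle(partner / z0) / 2)))
        else:                                    # step 3: conjugate reciprocal pair
            doas.append(np.angle(z0))
    return np.sort(doas)

true_doas = np.deg2rad([-40.0, 50.0])
A = steering(true_doas)
R = A @ A.conj().T + 0.1 * np.eye(len(positions))   # exact covariance
_, vecs = np.linalg.eigh(R)
En = vecs[:, :len(positions) - 2]                   # noise subspace
est = np.rad2deg(fd_root_music(En, n_sources=2, M3=20))
```

Because both groups are handled by pairing the selected root with its nearest counterpart, a misclassification caused by rounding changes the estimate only by a fraction of the pair spacing.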


5.4.4 Fourier Domain-Weighted Least-Squares 
Root-MUSIC 


To estimate the signal DOAs, it is important to have a close approximation of the
MUSIC null-spectrum function in the vicinity of the true DOAs; that is, in the
areas where the null-spectrum function has its minima. This idea was exploited in
Rübsamen and Gershman (2009) to obtain the following weighted least-squares
approximation of the MUSIC null-spectrum function:

$$\min_{\{H_m\}} \sum_{i=1}^{Q} w(\theta_i) \left| f(\theta_i) - \sum_{m=-(M_4-1)}^{M_4-1} H_m e^{jm\theta_i} \right|^2 \quad \text{s.t.} \quad H_m^* = H_{-m} \;\; \forall\, m = 0, \ldots, M_4 - 1 \tag{5.61}$$

where $w(\theta_i)$ are the weight coefficients, $Q$ is the number of MUSIC
null-spectrum samples, and $2M_4 - 1$ is the number of basis functions
$\exp\{jm\theta\}$ that are used to approximate the null-spectrum samples. The
weighted least-squares approximation in Equation (5.61) requires that
$Q \geq 2M_4 - 1$. A natural choice of weight coefficients to stress low values of
$f(\theta)$ is given by

$$w(\theta_i) = \frac{1}{f(\theta_i)} \tag{5.62}$$

The constraints in Equation (5.61) can be used to guarantee that the roots of the
resulting FDWLS Root-MUSIC polynomial

$$f_4(z) = \sum_{m=-(M_4-1)}^{M_4-1} H_m^{\star} z^m = \sum_{m=-(M_4-1)}^{M_4-1} H_m^{\star} e^{jm\theta} \triangleq f_4(\theta) \tag{5.63}$$

obey the conjugate reciprocity property, where $H_m^{\star}$
($m = -(M_4 - 1), \ldots, M_4 - 1$) are the optimal coefficients that solve
Equation (5.61). Eliminating the equality constraints in Equation (5.61) leads to
the following equivalent unconstrained optimization problem:
$$\min_{\{H_m\}} \sum_{i=1}^{Q} w(\theta_i) \left( f(\theta_i) - H_0 - 2 \sum_{m=1}^{M_4-1} \Re\left( H_m e^{jm\theta_i} \right) \right)^2 \tag{5.64}$$

and its solution is given by

$$u = \left( \tilde{B}^T D \tilde{B} \right)^{-1} \tilde{B}^T D f \tag{5.65}$$

where

$$f = [f(\theta_1), \ldots, f(\theta_Q)]^T \tag{5.66}$$

$$u = [H_0, \Re(H_1), \ldots, \Re(H_{M_4-1}), \Im(H_1), \ldots, \Im(H_{M_4-1})]^T \tag{5.67}$$

$$D = \mathrm{diag}\{w(\theta_1), \ldots, w(\theta_Q)\} \tag{5.68}$$

$$\tilde{B} = [1_Q,\; 2\Re(B),\; -2\Im(B)] \tag{5.69}$$

$[B]_{i,m} = e^{jm\theta_i}$ ($i = 1, \ldots, Q$; $m = 1, \ldots, M_4 - 1$), $1_Q$ is a
$Q \times 1$ vector of ones, and $\Re(\cdot)$ and $\Im(\cdot)$ denote the real and
imaginary parts of a complex value, respectively.

Equation (5.65) determines the coefficients of the FDWLS Root-MUSIC
polynomial (5.63). The same procedure as in FD Root-MUSIC must be used to
estimate the signal DOAs from the polynomial's roots. Rübsamen and Gershman
(2009) showed that for $M_3 = M_4$ the performance of FDWLS Root-MUSIC is
substantially better than that of FD Root-MUSIC.
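
The weighted least-squares solution can be sketched directly from Equations (5.64) through (5.69). In the following NumPy example the function name `fdwls_coefficients` is hypothetical, the weights are taken as the reciprocal of the null-spectrum samples, and a simple positive trigonometric polynomial stands in for the MUSIC null spectrum so that the exact coefficients are known in advance.

```python
import numpy as np

def fdwls_coefficients(theta, f, M4):
    """Weighted LS fit; returns H_m for m = -(M4-1), ..., M4-1."""
    m = np.arange(1, M4)
    B = np.exp(1j * np.outer(theta, m))                  # [B]_{i,m} = exp(j m theta_i)
    B_real = np.hstack([np.ones((len(theta), 1)), 2 * B.real, -2 * B.imag])
    D = np.diag(1.0 / f)                                 # weights w = 1 / f
    u = np.linalg.solve(B_real.T @ D @ B_real, B_real.T @ D @ f)
    H_pos = u[1:M4] + 1j * u[M4:]                        # H_1, ..., H_{M4-1}
    return np.concatenate([np.conj(H_pos[::-1]), [u[0]], H_pos])

M4 = 4
Q = 2 * (2 * M4 - 1)                                     # twice the number of basis functions
theta = -np.pi + 2 * np.pi * np.arange(Q) / Q
f = 2 + np.cos(theta) - 0.6 * np.sin(2 * theta)          # stand-in null spectrum, strictly positive
H = fdwls_coefficients(theta, f, M4)
expected = np.array([0, -0.3j, 0.5, 2, 0.5, 0.3j, 0])
```

The returned coefficients satisfy $H_{-m} = H_m^*$ by construction, so the polynomial can be rooted and its roots processed exactly as in FD Root-MUSIC.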


5.5 SIMULATION RESULTS 

In our simulation examples, we focus on the search-free techniques for arbitrary
arrays and compare the DOA estimation performances of the interpolated
Root-MUSIC, MS, FD Root-MUSIC, and FDWLS Root-MUSIC algorithms. As
the LS method provides the smallest possible approximation errors, it has been
used in the interpolated Root-MUSIC and MS techniques to compute the matrices
$G_1$ and $G_2$, respectively. The interpolated Root-MUSIC technique has been
applied to sectors of width 60°. Virtual arrays with the same number of sensors
as the real array have been selected, and their apertures have been chosen to be
orthogonal to the center directions of the sectors. Throughout our simulations,
we assume $M_2 = M_3 = M_4 = M$, which guarantees that the asymptotic complexity
of FD Root-MUSIC is no higher than that of the MS technique. The number
of null-spectrum samples $Q$ in FDWLS Root-MUSIC is twice that of Fourier
basis functions.

In our first example, a randomly generated (although fixed throughout all
simulation runs) nonuniform array (NUA) of N = 6 sensors is used. Its sensor
locations are depicted in Figure 5.3. Two closely spaced equal-power signal
sources are assumed to impinge on the array from the directions $\theta_1 = 15°$
and $\theta_2 = 20°$. The parameters K = 100 and M = 19 are taken in this example.
All simulation results have been averaged over 1000 independent Monte Carlo runs
and over the sources.

Figure 5.4 displays the DOA estimation root mean-square errors (RMSEs)
of the interpolated Root-MUSIC, MS, FD Root-MUSIC, and FDWLS Root-MUSIC
techniques. The stochastic CRB is also shown. The DOA estimation
methods tested saturate in their performance at high signal-to-noise ratios
(SNRs), due to the approximation errors in Equation (5.50) for interpolated
Root-MUSIC, Equation (5.53) for MS, Equations (5.58) and (5.60) for FD
Root-MUSIC, and Equation (5.61) for FDWLS Root-MUSIC. It can be observed that
the performance saturation is less pronounced in the case of FD Root-MUSIC
and FDWLS Root-MUSIC, which in particular substantially outperform the MS
technique in the high SNR region. As expected, the performance of FDWLS
Root-MUSIC is the best among these methods.

FIGURE 5.4 DOA estimation RMSEs versus SNR for K = 100, $\theta_1 = 15°$, $\theta_2 = 20°$, and M = 19.
The array geometry corresponds to Figure 5.3.

In our second example, the RMSE performance of the same methods versus
the angular separation between the sources is examined. To demonstrate that the
search-free methods tested have an improved resolution threshold as compared
to Spectral-MUSIC for closely spaced signals (Rao and Hari, 1989), we also add
the Spectral-MUSIC performance curve in Figure 5.5. In this figure, the DOA of
the first source is varied, whereas the DOA of the second source is fixed and equal
to $\theta_2 = 20°$. It has been assumed that SNR = 20 dB and all other parameters
are as in the previous example. As FDWLS Root-MUSIC has essentially the same
performance as FD Root-MUSIC for the chosen set of parameters, we only plot
the curve for the latter.

Figure 5.5 shows that the rooting-based methods have quite similar performance
in the threshold domain, where they substantially outperform Spectral-MUSIC.
Furthermore, FD Root-MUSIC outperforms MS in the asymptotic
domain (i.e., in the case of widely spaced signals).

FIGURE 5.5 DOA estimation RMSEs versus $\theta_2 - \theta_1$ for K = 100, SNR = 20 dB, $\theta_2 = 20°$, and
M = 19. The array geometry corresponds to Figure 5.3.
In our third example, we extend the previous examples by considering the
case when different (randomly generated) array geometries are used in each
simulation run. In this example, the locations of N = 8 array sensors are drawn
uniformly from the interior of a circle with radius $\lambda$. The performances of
the FD Root-MUSIC and MS techniques are compared for SNR = 20 dB and K = 100
snapshots. The angular spacing between the two sources is assumed to be equal
to 30°. In Figure 5.6, a scatter plot for the DOA estimation RMSEs of the MS
and FD Root-MUSIC techniques is shown for M = 15. Each point on this plot
corresponds to one random realization of the array geometry.

Figure 5.6 demonstrates the substantial performance improvements achieved
by FD Root-MUSIC as compared to the MS technique. Note that the cases where
FD Root-MUSIC outperforms the MS technique correspond to the points located
above the diagonal line. The geometric mean of the RMSE improvement in this
example is approximately a factor of 2.9.

FIGURE 5.6 Scatter plot of the RMSEs of the MS and FD Root-MUSIC techniques for N = 8,
K = 100, SNR = 20 dB, $\theta_2 - \theta_1 = 30°$, and M = 15. The array geometry is randomly generated in
each simulation run.

In the last example, we study the impact of the parameter $M = M_2 = M_3 = M_4$
on the performance of the methods tested. We consider a random array geometry
drawn in each run uniformly from the interior of a circle of radius $1.25\lambda$. The
other array and source parameters are chosen as before. In Figure 5.7, the DOA
estimation RMSEs are plotted versus the value of M. All curves in this figure have
been averaged over 1000 realizations of the random array geometry. Clearly, FD
and FDWLS Root-MUSIC require reduced polynomial orders as compared to
the MS technique to achieve the same performance.

FIGURE 5.7 DOA estimation RMSEs versus M for K = 100, SNR = 20 dB, and $\theta_2 - \theta_1 = 30°$.
The array geometry is randomly generated in each simulation run.


5.6 CONCLUSION 

An overview of search-free DOA estimation methods was presented with a 
particular emphasis on nonuniform sensor arrays. First, the traditional Root- 
MUSIC and ESPRIT techniques were discussed. Then, more recent search-free 
direction-finding methods were introduced that extend the classic Root-MUSIC 
and ESPRIT approaches to more general classes of nonuniform arrays. In the 
final sections of the chapter, we discussed search-free approaches that are appli¬ 
cable to arbitrary array geometries. Simulation results were presented to compare 
the performance of the latter techniques. 

Our results demonstrate that the family of Fourier Domain Root-MUSIC 
techniques represents a computationally attractive alternative to existing search- 
free DOA estimation methods for arbitrary nonuniform arrays. 
Spatial Time-Frequency
Estimation


Moeness Amin, Yimin Zhang 


6.1 INTRODUCTION 


This chapter discusses the direction-of-arrival (DOA) estimation of far-field
sources producing nonstationary signals, particularly those in the form of
frequency-modulated (FM) waveforms, using a class of methods based on the
quadratic (bilinear) spatial time-frequency distribution (STFD) framework.
Nonstationary signals are encountered in various applications in communications,
radar systems, and biomedicine. For example, many modern radar systems use
linear FM (LFM, or "chirp") signals to achieve pulse compression. Maneuvering
or rotation due to targets or surrounding environments may generate time-varying
Doppler frequency with well-defined Doppler frequency signatures (Boashash,
2003; Qian and Chen, 1996). FM signals are also among the popularly used
intentional jammer sources for blocking communications in a wide-frequency
spectrum (Amin, 1997).


For many decades, time-frequency (t-f) signal representations, such as the 
Wigner-Ville distribution (WVD) and spectrogram, were only applied to analyze 
nonstationary signals incident on single-sensor receivers. The objective was to 
characterize signals in the t-f domain, leading to proper signal characterization, 
separation, classification, and cancellation. These offerings were subsequently 
enhanced by new quadratic time-frequency distributions (TFDs), which have 
led to improved multicomponent signal power localizations in the t-f domain. 
As many applications of signal processing in the areas of communication, radar,
acoustics, and biomedicine call for the use of multisensor receivers, it has become
important to consider TFDs in the context of array processing. This is achieved
within the framework of STFD (Belouchrani and Amin, 1998, 1999; Zhang et al.,
2001).
The STFD framework was first developed by Belouchrani and Amin (1998)
for the blind separation of narrowband nonstationary signals. It was shown that
the STFD matrix is related to the source TFD matrix by the spatial mixing
matrix in a manner similar to the commonly used formula in narrowband
array-processing problems that use second-order statistics to relate the sensor
spatial covariance and source covariance matrices. The framework was then
used for DOA estimation of nonstationary signals using t-f-based DOA estimation
techniques, including t-f maximum likelihood (t-f ML) and t-f ESPRIT (Amin and
Zhang, 2000; Belouchrani and Amin, 1999; Gershman and Amin, 2000; Hassanien
et al., 2002; Sekihara et al., 1999; Wang and Wu, 2002; Zhang et al., 2000). These
techniques have shown improved performance compared to their conventional
DOA counterparts.

Thorough supporting analyses of the advantages of STFD-based DOA estimation
techniques and the robustness of the signal and noise subspaces estimated
from STFD matrices were provided by Zhang et al. (2001). It was shown that,
by constructing an STFD matrix from the selected t-f points of highly localized
signal energy, the corresponding signal and noise subspace estimates become
more robust to noise than their counterparts obtained using the data covariance
matrix as a result of signal-to-noise ratio (SNR) enhancement. In addition, source
masking and filtering in the t-f domain allow separation and thus consideration
of individual sources or subsets of sources in the field of view. Source elimination,
rendered through the selection of specific t-f regions corresponding only
to a subset of signal arrivals, further increases the SNR and reduces mutual
interference between signals, yielding improved subspace robustness. With such
discriminatory capability, the receiver can process more signals than sensors
and provide improved DOA estimation. In some applications, particularly when
acoustic signals are involved, the TFD of multiple signals can be approximately
disjoint or orthogonal; that is, only one signal is active in the t-f plane at a given
t-f point. Therefore, the signals' DOAs can be estimated using only two receive
sensors (Rickard and Dietrich, 2000). Accordingly, DOA estimation for nonstationary
signals that are well localized in the t-f domain or characterized by their
instantaneous frequencies (IFs) should be performed using t-f methods.

The merits of t-f DOA estimation can be realized only through the selection
of appropriate t-f points in the construction of STFD matrices. While in some
scenarios the selection of peak t-f points may be relatively easy, the problem
becomes more challenging in other situations, for example, when the signals are
highly contaminated by noise. Utilization of the spatial diversity inherent in the
STFD matrix can enhance the t-f signature of the signals of interest. A basic
example is simple averaging of the TFDs over all the receive sensors (Mu et al.,
2003). Analysis of TFDs across different sensors also helps identify auto-term
and cross-term points (Linh-Trung et al., 2005; Zhang and Amin, 2006). The
separation of the auto-term t-f points from cross-term t-f points is in general
a less critical issue in DOA estimation than in blind source separation
applications, because in performing DOA estimation the STFD matrices
only need to meet the full-rank requirement (Amin and Zhang, 2000).
Nevertheless, the capability of separating auto-term from cross-term points often
helps the selection of appropriate t-f regions corresponding to a subset of
sources for source discrimination.

This chapter also reviews relevant advances in different aspects of STFD.
One is the augmentation of the polarization dimension, yielding spatial
polarimetric time-frequency distributions (SPTFD). SPTFD methods include
polarimetric time-frequency MUSIC (PTF-MUSIC) and polarimetric time-frequency

ESPRIT (PTF-ESPRIT) (Mu and Amin, 2000; Obeidat et al., 2003; Zhang et al.,
2006a), which take advantage of the distinction of signal signatures between
the sources in both the t-f and polarization domains for the improvement of
DOA estimation performance. In this way they outperform their respective
counterparts that only use source diversity in terms of t-f or polarization.

The advantage of using polarization dimensionality becomes particularly
evident when the sources have distinct polarizations but are closely spaced and
have similar waveforms. Application of PTF-MUSIC for the tracking of moving
sources with time-varying polarizations was documented in Zhang et al. (2006b).
The exploitation of spatial joint variable distributions (SJVDs), such as the spatial
ambiguity function (SAF) that employs the Doppler and lag variables, permits
estimation of DOA information for nonstationary signals that can be represented
using their ambiguity domain characteristics (Amin et al., 2000).

Another important extension of STFD and SJVD DOA methods is the
consideration of wideband signals where the narrowband assumption does not hold
(Gershman and Amin, 2000; Ma and Goh, 2006; Obeidat et al., 2004).


Note that this chapter focuses on DOA estimation approaches for nonstationary
signals based on the quadratic STFD framework. In addition to quadratic
TFDs, a nonstationary signal can also be described using linear transforms, such
as the short-time Fourier transform (STFT) and the wavelet transform.
Accordingly, DOA estimation can be performed based on linear t-f representations,
provided that power localization and enhanced SNR are achieved. However,
the STFT has the well-known shortcoming of trading off resolution in the
time domain against resolution in the frequency domain. Multiresolution
analysis, on the other hand, is not the most effective tool for signals
characterized by their IF laws.


6.2 TIME-FREQUENCY DISTRIBUTION 

Cohen’s (1989, 1995) class of TFDs of a narrowband signal x(t) is defined as

$$D_{xx}(t,f) = \iint \phi(t-u,\tau)\, x\!\left(u+\frac{\tau}{2}\right) x^{*}\!\left(u-\frac{\tau}{2}\right) e^{-j2\pi f\tau}\, du\, d\tau \qquad (6.1)$$

where t and f represent the time and frequency indexes, respectively, $\phi(t,\tau)$ is
the t-f kernel, $\tau$ is the time-lag variable, and * denotes complex conjugation. In
this chapter, all the integrals are from $-\infty$ to $\infty$.

The cross-term TFD of two signals $x_i(t)$ and $x_k(t)$ is defined by

$$D_{x_i x_k}(t,f) = \iint \phi(t-u,\tau)\, x_i\!\left(u+\frac{\tau}{2}\right) x_k^{*}\!\left(u-\frac{\tau}{2}\right) e^{-j2\pi f\tau}\, du\, d\tau \qquad (6.2)$$


In practice, TFDs are often evaluated using their discrete-time forms (Boashash
and Putland, 2003). To use an integer time delay $\tau$, we rewrite Equation (6.1) as

$$D_{xx}(t,f) = 2\iint \phi(t-u,2\tau)\, x(u+\tau)\, x^{*}(u-\tau)\, e^{-j4\pi f\tau}\, du\, d\tau \qquad (6.3)$$

The discrete form of the auto-term TFD corresponding to Equation (6.3) is often
expressed as

$$D_{xx}(t,f) = \sum_{u=-\infty}^{\infty}\sum_{\tau=-\infty}^{\infty} \phi(t-u,\tau)\, x(u+\tau)\, x^{*}(u-\tau)\, e^{-j4\pi f\tau} \qquad (6.4)$$

which omits the constant factor of two and the scaling factor in $\tau$ for
notational convenience. Similarly, the cross-term TFD corresponding to
Equation (6.2) is expressed as

$$D_{x_i x_k}(t,f) = \sum_{u=-\infty}^{\infty}\sum_{\tau=-\infty}^{\infty} \phi(t-u,\tau)\, x_i(u+\tau)\, x_k^{*}(u-\tau)\, e^{-j4\pi f\tau} \qquad (6.5)$$


The TFD maps one-dimensional (1D) signals in the time domain into
two-dimensional (2D) signal representations in the t-f domain. The TFD property
of concentrating the input signal energy around its IF while spreading the noise
energy over the entire t-f domain increases the effective SNR and proves valuable
in DOA estimation. For a single-component LFM signal, the pseudo-Wigner-Ville
distribution (PWVD) can achieve an SNR improvement of up to the window length
(Zhang et al., 2001). This enhancement is determined predominantly by the window
size and is less sensitive to the kernel type (Mu and Amin, 2000). When all t-f
points within a 3-dB bandwidth of the peaks are selected, the SNR improvement
remains proportional to the window length (Xia and Chen, 1999). Such observations
are valid for a general class of signals provided that the third-order derivative
of the waveform phase is negligible or, equivalently, that the waveforms can be
approximated by an LFM within each sliding-window interval.

The properties of a TFD can be characterized by simple constraints on the
kernel. Different kernels can be used to generate TFDs with prescribed, desirable
properties. The WVD is often regarded as the basic or prototype quadratic TFD,
because the other quadratic TFDs can be described as filtered versions of it.
The WVD is known to provide the best t-f resolution for single-component LFM
signals, but it yields high cross-terms when the frequency law is not linear or
when multicomponent signals are considered.

TABLE 6.1 Time-Frequency Kernels

Distribution            Kernel $\phi(t,\tau)$
Wigner-Ville            $\delta(t)$
Pseudo-Wigner-Ville     $\delta(t)\,w(\tau)$
Choi-Williams           $\exp(-\sigma t^{2}/\tau^{2})$
Zhao-Atlas-Marks        $w(\tau)\,\mathrm{rect}\!\left(\frac{t}{2\tau/a}\right)$

Various reduced-interference kernels have been developed to lower cross-term
interference. Table 6.1 shows some commonly used kernel functions
(Boashash, 2003), where $\delta(t)$ is a Dirac delta function, rect(t) is a rectangular
window function, $w(\tau)$ is an arbitrary window function, and $\sigma$ and a are scalars.
TFD examples that use these kernels are illustrated in Examples 6.1 and 6.2.


Example 6.1 

Figures 6.1(a) and 6.1(b) show the real and imaginary parts of an analytic
LFM signal, respectively, expressed as $x(t) = \exp[j2\pi(0.1t + 0.001176t^{2}/2)]$, t =
0, ..., 255. The start and end frequencies of the LFM signal are, respectively,
0.1 and 0.4. The Fourier transform of the signal yields a spectrum spreading over
the normalized frequency band [0.1, 0.4], as shown in Figure 6.1(c). The WVD of
the waveform, shown in Figure 6.1(d), depicts high energy concentration of the
instantaneous narrowband signal with a linearly time-varying IF signature.
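
The behavior described in Example 6.1 can be checked numerically. The following is a minimal sketch, not the code used to produce Figure 6.1: it evaluates the discrete form of Equation (6.4) with the Wigner-Ville kernel by an FFT over the lag variable. The function name `wvd`, the number of frequency bins, and the truncation of the lag range at the record edges are assumptions made for illustration.

```python
import numpy as np

def wvd(x, n_freq=256):
    """Discrete Wigner-Ville distribution: Eq. (6.4) with the kernel delta(t).

    Column t holds D_xx(t, f_k) for f_k = k / (2 * n_freq), k = 0..n_freq-1,
    so the frequency axis covers the normalized band [0, 0.5).
    """
    x = np.asarray(x, dtype=complex)
    n_samples = len(x)
    tfd = np.zeros((n_freq, n_samples))
    for t in range(n_samples):
        # Largest lag that keeps both t + tau and t - tau inside the record.
        tau_max = min(t, n_samples - 1 - t, n_freq // 2 - 1)
        tau = np.arange(-tau_max, tau_max + 1)
        lag_product = np.zeros(n_freq, dtype=complex)
        lag_product[tau % n_freq] = x[t + tau] * np.conj(x[t - tau])
        # The FFT over tau evaluates sum_tau (.) exp(-j 4 pi f_k tau).
        tfd[:, t] = np.fft.fft(lag_product).real
    return tfd

# LFM whose IF sweeps linearly from 0.1 to 0.4 over 256 samples.
n_samples = 256
t = np.arange(n_samples)
rate = 0.3 / (n_samples - 1)                 # chirp rate, about 0.001176
inst_freq = 0.1 + rate * t
x = np.exp(2j * np.pi * (0.1 * t + 0.5 * rate * t**2))

tfd = wvd(x)
freq_axis = np.arange(tfd.shape[0]) / (2 * tfd.shape[0])
ridge = freq_axis[np.argmax(tfd, axis=0)]    # peak frequency at each time
```

Away from the record edges, `ridge` follows `inst_freq` to within one frequency bin, which is the linear IF signature the example describes.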


Example 6.2 

Figure 6.2 shows the t-f representations of two time-limited parallel LFM
signals using different kernels. The WVD provides sharp auto-term signatures,
whereas the cross-terms are evidently present in the middle of the two auto-term
signatures. With PWVD, the cross-terms in the time domain are mitigated,
whereas those in the frequency domain remain. Both the Choi-Williams
distribution and the Zhao-Atlas-Marks distribution provide much reduced
cross-term presence. The three non-WVD distributions yield much wider auto-term
signatures in the t-f domain.

FIGURE 6.1 Waveform and Wigner-Ville distribution of an LFM signal: (a) real part of waveform;
(b) imaginary part of waveform; (c) FFT magnitude; and (d) Wigner-Ville distribution.


6.3 SPATIAL TIME-FREQUENCY DISTRIBUTION 

This section introduces the concept of STFD and then analyzes the robustness 
of the subspace estimated from STFD matrices. The fact that STFD matrices 
provide higher robustness compared to the corresponding covariance matrices is 
the foundation of extending subspace-based DOA estimation techniques in the 
STFD platform for improved DOA estimation. As examples, we introduce the t-f 
MUSIC and t-f ML techniques and demonstrate their performance advantages 
over conventional DOA estimation techniques. 

6.3.1 Definitions 

FIGURE 6.2 TFDs of two LFM signals corresponding to different kernels: (a) Wigner-Ville distribution;
(b) pseudo-Wigner-Ville distribution; (c) Choi-Williams distribution; and (d) Zhao-Atlas-Marks distribution.

Consider n narrowband nonstationary signals impinging on an array consisting
of m single-polarized sensors. For simplicity, we assume a 1D DOA estimation
problem (e.g., only the azimuth angle is considered), but the extension to
a 2D problem (i.e., both the azimuth and elevation angles are considered) is
straightforward. The m × 1 received data vector x(t) and the n × 1 source signal
vector s(t) are related by


$$\mathbf{x}(t) = \mathbf{A}(\boldsymbol{\Theta})\,\mathbf{s}(t) + \mathbf{n}(t) \qquad (6.6)$$

where the m × n matrix $\mathbf{A}(\boldsymbol{\Theta}) = [\mathbf{a}(\theta_1), \mathbf{a}(\theta_2), \ldots, \mathbf{a}(\theta_n)]$ is the mixing matrix
that holds the steering vectors of the n signals, $\boldsymbol{\Theta} = [\theta_1, \theta_2, \ldots, \theta_n]$, and $\mathbf{a}(\theta_q)$ is
the steering vector for the qth source, $s_q(t)$, that arrives from direction $\theta_q$. Each
element of $\mathbf{s}(t) = [s_1(t), s_2(t), \ldots, s_n(t)]^{T}$ is assumed to be a mono-component
signal, where the superscript T denotes the transpose of a matrix or a vector.

Because of the signal mixing occurring at each sensor, the elements of x(t)
become multicomponent signals. n(t) is an m × 1 additive noise vector that
consists of independent and identically distributed (i.i.d.) zero-mean, white,
complex Gaussian processes with covariance matrix $\sigma^{2}\mathbf{I}$, where I denotes an
identity matrix of appropriate dimension. The noise elements are assumed to be
independent of the signals, which are assumed to be deterministic.

The STFD matrix of vector x(t) (Belouchrani and Amin, 1998) is expressed as

$$\mathbf{D}_{xx}(t,f) = \iint \phi(t-u,\tau)\, \mathbf{x}\!\left(u+\frac{\tau}{2}\right) \mathbf{x}^{H}\!\left(u-\frac{\tau}{2}\right) e^{-j2\pi f\tau}\, du\, d\tau \qquad (6.7)$$

where the superscript H denotes the conjugate transpose (Hermitian) operation, and
the (i, k)th element of $\mathbf{D}_{xx}(t,f)$ is given in Equation (6.2) for i, k = 1, 2, ..., m.
The noise-free STFD matrix is obtained by substituting Equation (6.6) into
Equation (6.7), resulting in

$$\mathbf{D}_{xx}(t,f) = \mathbf{A}(\boldsymbol{\Theta})\,\mathbf{D}_{ss}(t,f)\,\mathbf{A}^{H}(\boldsymbol{\Theta}) \qquad (6.8)$$

where $\mathbf{D}_{ss}(t,f)$ is the TFD matrix of s(t), which consists of auto-source TFDs
as the diagonal elements and cross-source TFDs as the off-diagonal elements. In
the presence of noise, the expected value of $\mathbf{D}_{xx}(t,f)$ becomes

$$E[\mathbf{D}_{xx}(t,f)] = \mathbf{A}(\boldsymbol{\Theta})\,\mathbf{D}_{ss}(t,f)\,\mathbf{A}^{H}(\boldsymbol{\Theta}) + \sigma^{2}\mathbf{I} \qquad (6.9)$$

where E[·] denotes statistical expectation.

Equation (6.9) relates the STFD matrix to the source TFD matrix in a manner
similar to the formula that is commonly used in narrowband array-processing
problems to relate the source covariance and sensor spatial covariance matrices.
It is clear, therefore, that the subspace spanned by the principal eigenvectors
of $\mathbf{D}_{xx}(t,f)$ and the subspace spanned by the columns of $\mathbf{A}(\boldsymbol{\Theta})$ are identical.
As will be discussed, the construction of the STFD matrix from the t-f points of
highly localized signal energy allows the corresponding signal and noise subspace
estimates to become more robust to noise than their counterparts obtained from
the data covariance matrix.


6.3.2 Utilization of Multiple Time-Frequency Points 

Let $\mathbf{S}^{tf}$ and $\mathbf{G}^{tf}$, respectively, span the signal and noise subspaces corresponding
to a t-f region. A simple way to estimate $\mathbf{S}^{tf}$ and $\mathbf{G}^{tf}$ is through eigendecomposition
of a single STFD matrix $\mathbf{D}_{xx}(t,f)$ obtained at a t-f point. However, subspace
estimation from a single STFD matrix is usually not robust and may suffer from a
singularity problem. Multiple t-f points that share the same spatial signature can
be used to avoid such a problem. Joint block diagonalization (JBD) of the combined
set of K t-f points, $\mathbf{D}_{xx}(t_k, f_k)$, k = 1, ..., K, was introduced in Belouchrani and
Amin (1999). The JBD is achieved by maximization, under unitary transform, of the
following criterion (Belouchrani et al., 1997):


C(E^)= 


K m 



|efD 


XX 


(■ tk,fk)ei 


i,l 


(6.10)

over the set of unitary matrices, where $\mathbf{E}^{tf} = [\mathbf{S}^{tf}\ \mathbf{G}^{tf}]$, and $\mathbf{e}_i$ is the ith column of
$\mathbf{E}^{tf}$. In Belouchrani et al. (1997), an efficient Jacobi-like algorithm for solving
Equation (6.10) was presented.

Alternatively, STFD matrices evaluated at multiple t-f points can be incorporated
by averaging them and performing eigendecomposition of the resulting STFD
matrix (Zhang et al., 2001). This approach is simpler than JBD, particularly when
the number of t-f points is large.


6.3.3 SNR Enhancement and Source Discrimination 

To understand the properties of STFDs, we consider the PWVD applied to FM
signals. This consideration is motivated by the fact that FM signals have
constant magnitude and clear t-f signatures characterized by their IFs. They can
be modeled as

$$\mathbf{s}(t) = [s_1(t), \ldots, s_n(t)]^{T} = [D_1 e^{j\psi_1(t)}, \ldots, D_n e^{j\psi_n(t)}]^{T} \qquad (6.11)$$

where $D_q$ and $\psi_q(t)$ are the fixed amplitude and time-varying phase of the qth
source signal. For each sampling time t, the IF of $s_q(t)$ is expressed as
$f_q(t) = \frac{1}{2\pi}\frac{d\psi_q(t)}{dt}$.

Using the PWVD, we construct the spatial pseudo-Wigner-Ville distribution
(SPWVD) matrix, which, for a vector signal x(t), is expressed in discrete
format as

$$\mathbf{D}_{xx}(t,f) = \sum_{\tau=-(L-1)/2}^{(L-1)/2} \mathbf{x}(t+\tau)\,\mathbf{x}^{H}(t-\tau)\, e^{-j4\pi f\tau} \qquad (6.12)$$

where the window size L is chosen as an odd integer. An STFD matrix can be
constructed by averaging $\mathbf{D}_{xx}(t,f)$ at the peak auto-term t-f points of $n_o$ selected
FM signals:

$$\hat{\mathbf{D}} = \frac{1}{n_o N'} \sum_{q=1}^{n_o}\sum_{i=1}^{N'} \mathbf{D}_{xx}\big(t_i, f_{q,i}(t_i)\big) \qquad (6.13)$$

where $f_{q,i}(t_i)$ is the IF of the qth signal at the ith time sample, and
N' = N − L + 1 is the number of t-f points that are not computed from padded
zeros, with N denoting the total number of observation samples. We assume
$N \gg n$ and $N \gg m$. In the case where the IF law of the FM signals is approximated
by a linear behavior over each kernel window (i.e., the third- and higher-order
derivatives of the phase of the FM signals are negligible over the time
period [t − L + 1, t + L − 1] for all values of t), the expectation of $\hat{\mathbf{D}}$ is given
by Zhang et al. (2001) as

$$\mathbf{D} = E[\hat{\mathbf{D}}] = \frac{1}{n_o N'} \sum_{q=1}^{n_o}\sum_{i=1}^{N'} E\big[\mathbf{D}_{xx}\big(t_i, f_{q,i}(t_i)\big)\big] = \frac{L}{n_o}\,\mathbf{A}(\boldsymbol{\Theta}^{o})\,\mathbf{R}^{o}_{ss}\,\mathbf{A}^{H}(\boldsymbol{\Theta}^{o}) + \sigma^{2}\mathbf{I} \qquad (6.14)$$

where $\mathbf{R}^{o}_{ss} = \mathrm{diag}[D_1^{2}, \ldots, D_{n_o}^{2}]$ and $\mathbf{A}(\boldsymbol{\Theta}^{o}) = [\mathbf{a}(\theta_1), \ldots, \mathbf{a}(\theta_{n_o})]$, respectively,
represent the signal covariance matrix and the mixing matrix formulated by only
considering the $n_o$ signals selected from the total number of n signal arrivals.
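
The averaging step can be sketched in code. The following is a minimal, noise-free illustration under stated assumptions: a uniform linear array with half-wavelength spacing, a single LFM source whose IF is known exactly, and the hypothetical helper names `steering`, `spwvd`, and `averaged_stfd`. In this idealized setting the averaged matrix equals the signal term of Equation (6.14) with $n_o = 1$.

```python
import numpy as np

def steering(theta_deg, m, d=0.5):
    """ULA steering vector; d is the spacing in wavelengths (assumed convention)."""
    return np.exp(-2j * np.pi * d * np.arange(m) * np.sin(np.deg2rad(theta_deg)))

def spwvd(X, t, f, L):
    """SPWVD matrix of Eq. (6.12) at sample t and normalized frequency f."""
    half = (L - 1) // 2
    m = X.shape[0]
    D = np.zeros((m, m), dtype=complex)
    for tau in range(-half, half + 1):
        # x(t + tau) x^H(t - tau), weighted by the kernel phase.
        D += np.outer(X[:, t + tau], X[:, t - tau].conj()) * np.exp(-4j * np.pi * f * tau)
    return D

def averaged_stfd(X, inst_freqs, L):
    """Eq. (6.13): average SPWVD matrices along the IF laws of the selected sources."""
    N = X.shape[1]
    half = (L - 1) // 2
    times = range(half, N - half)     # the N' = N - L + 1 points free of zero padding
    acc = sum(spwvd(X, t, f_q[t], L) for f_q in inst_freqs for t in times)
    return acc / (len(inst_freqs) * len(times))

# One unit-amplitude LFM source at 20 degrees, four sensors, no noise.
m, N, L = 4, 64, 17
t = np.arange(N)
rate = 0.2 / (N - 1)
inst_freq = 0.1 + rate * t
s = np.exp(2j * np.pi * (0.1 * t + 0.5 * rate * t**2))
a = steering(20.0, m)
X = np.outer(a, s)

D_hat = averaged_stfd(X, [inst_freq], L)
expected = L * np.outer(a, a.conj())  # (L / n_o) A R_ss A^H with n_o = 1, D_1 = 1
```

With noise added to `X`, `D_hat` fluctuates around the value of Equation (6.14) rather than matching the signal term exactly.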

It becomes clear from the preceding expression that the expected STFD
matrix is similar to a covariance matrix constructed as if only the $n_o$ selected
sources are present, and the SNR is enhanced by a factor of $G = L/n_o$, which is
proportional to the window length L and inversely proportional to the number
of sources contributing to matrix D. Therefore, when $L \gg n_o$ is satisfied, DOA
estimation of the selected $n_o$ sources can be achieved based on the expected
STFD matrix D with improved performance.

With the use of STFD matrices, array processing can be performed for a
subclass of the impinging signals with specific t-f signatures. In this respect, t-f
DOA estimation techniques have implicit spatial filtering, removing the
undesired signals from consideration. From the SNR perspective, it is desirable to
choose a small subset of signals. Ideally, it is best to set $n_o = 1$, that is, to select
the sets of N' t-f points that belong to only a single signal at a time. It is also
important to note that, with the ability to construct the STFD matrix from one or
a few signal arrivals, the well-known m > n condition on DOA estimation using
arrays can be relaxed to $m > n_o$. In other words, we can perform DOA estimation
with fewer array sensors than the total number of impinging signals. Further, from
the angular resolution perspective, closely spaced sources with different t-f
signatures can be resolved by constructing two separate STFDs, each corresponding
to one source. It was demonstrated in Rickard and Dietrich (2000) that when the
TFDs of multiple signals can be considered approximately disjoint or orthogonal
(i.e., only one signal is active at a given t-f point), the signals’ DOAs
can be estimated using only two receive sensors. The drawback to using STFD
matrices corresponding to a subclass of signal arrivals is of course the need for
repeated computations if the DOA information on all source signals is of interest.


6.3.4 Subspace Analysis for FM Signals 

To analyze the subspace robustness, we first present the case of FM signals using
the conventional covariance matrix approach. In this case, it is required that the
number of sensors be greater than the number of sources (i.e., m > n) and that
matrix $\mathbf{A}(\boldsymbol{\Theta})$ have full column rank. Then the covariance matrix of x(t) is given by

$$\mathbf{R}_{xx} = E[\mathbf{x}(t)\mathbf{x}^{H}(t)] = \mathbf{A}(\boldsymbol{\Theta})\,\mathbf{R}_{ss}\,\mathbf{A}^{H}(\boldsymbol{\Theta}) + \sigma^{2}\mathbf{I} \qquad (6.15)$$

where $\mathbf{R}_{ss} = E[\mathbf{s}(t)\mathbf{s}^{H}(t)]$ is the source covariance matrix.



















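Let $\lambda_1 > \lambda_2 > \ldots > \lambda_n > \lambda_{n+1} = \lambda_{n+2} = \ldots = \lambda_m = \sigma^{2}$ denote the eigenvalues
of $\mathbf{R}_{xx}$. The unit-norm eigenvectors associated with $\lambda_1, \ldots, \lambda_n$ constitute the
columns of the signal subspace matrix $\mathbf{S} = [\mathbf{s}_1, \ldots, \mathbf{s}_n]$, and those corresponding
to $\lambda_{n+1}, \ldots, \lambda_m$ make up the noise subspace matrix $\mathbf{G} = [\mathbf{g}_1, \ldots, \mathbf{g}_{m-n}]$.
When $\mathbf{R}_{xx}$ is estimated from the available data samples x(i), i = 1, 2, ..., N,
the estimated covariance matrix is given by $\hat{\mathbf{R}}_{xx} = \frac{1}{N}\sum_{i=1}^{N} \mathbf{x}(i)\mathbf{x}^{H}(i)$. Let
$\{\hat{\mathbf{s}}_1, \ldots, \hat{\mathbf{s}}_n, \hat{\mathbf{g}}_1, \ldots, \hat{\mathbf{g}}_{m-n}\}$ denote the unit-norm eigenvectors of $\hat{\mathbf{R}}_{xx}$, arranged
in descending order of associated eigenvalues, and denote $\hat{\mathbf{S}} = [\hat{\mathbf{s}}_1, \ldots, \hat{\mathbf{s}}_n]$ and
$\hat{\mathbf{G}} = [\hat{\mathbf{g}}_1, \ldots, \hat{\mathbf{g}}_{m-n}]$. For mutually uncorrelated signals, the orthogonal
projections of $\{\hat{\mathbf{g}}_i\}$ onto the column space of S are asymptotically (for large N)
jointly Gaussian distributed with zero means and covariance matrices (Zhang et al.,
2001) given by

$$E\big[(\mathbf{S}\mathbf{S}^{H}\hat{\mathbf{g}}_i)(\mathbf{S}\mathbf{S}^{H}\hat{\mathbf{g}}_j)^{H}\big] = \frac{\sigma^{2}}{N}\left[\sum_{k=1}^{n} \frac{\lambda_k}{(\sigma^{2}-\lambda_k)^{2}}\,\mathbf{s}_k\mathbf{s}_k^{H}\right]\delta_{ij} \triangleq \frac{1}{N}\,\mathbf{U}\,\delta_{ij} \qquad (6.16)$$

$$E\big[(\mathbf{S}\mathbf{S}^{H}\hat{\mathbf{g}}_i)(\mathbf{S}\mathbf{S}^{H}\hat{\mathbf{g}}_j)^{T}\big] = \mathbf{0}, \quad \text{for all } i, j \qquad (6.17)$$

where $\delta_{ij}$ is the Kronecker delta function. Both equations are identical to those
derived for independent stochastic signals (Stoica and Nehorai, 1989).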

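Now we consider the subspace of STFD matrix D and its estimate $\hat{\mathbf{D}}$. The
results are compared with those obtained from $\mathbf{R}^{o}_{xx} = \mathbf{A}^{o}\mathbf{R}^{o}_{ss}(\mathbf{A}^{o})^{H} + \sigma^{2}\mathbf{I}$ and its
estimate $\hat{\mathbf{R}}^{o}_{xx}$. Let $\lambda^{o}_1 > \ldots > \lambda^{o}_{n_o} > \lambda^{o}_{n_o+1} = \lambda^{o}_{n_o+2} = \ldots = \lambda^{o}_m = \sigma^{2}$ denote the
eigenvalues of $\mathbf{R}^{o}_{xx}$, and let its signal subspace and noise subspace be represented,
respectively, by $\mathbf{S}^{o} = [\mathbf{s}^{o}_1, \ldots, \mathbf{s}^{o}_{n_o}]$ and $\mathbf{G}^{o} = [\mathbf{g}^{o}_1, \ldots, \mathbf{g}^{o}_{m-n_o}]$. We also denote
$\lambda^{tf}_1 > \lambda^{tf}_2 > \ldots > \lambda^{tf}_{n_o} > \lambda^{tf}_{n_o+1} = \ldots = \lambda^{tf}_m = (\sigma^{tf})^{2}$ as the eigenvalues of D,
and $\mathbf{S}^{tf} = [\mathbf{s}^{tf}_1, \ldots, \mathbf{s}^{tf}_{n_o}]$ and $\mathbf{G}^{tf} = [\mathbf{g}^{tf}_1, \ldots, \mathbf{g}^{tf}_{m-n_o}]$ as the corresponding signal
and noise subspaces. Then

• The signal and noise subspaces $\mathbf{S}^{tf}$ and $\mathbf{G}^{tf}$ are respectively the same as
$\mathbf{S}^{o}$ and $\mathbf{G}^{o}$.

• The eigenvalues have the following relationship:

$$\lambda^{tf}_i = \begin{cases} \dfrac{L}{n_o}\big(\lambda^{o}_i - \sigma^{2}\big) + \sigma^{2} = \dfrac{L}{n_o}\lambda^{o}_i + \Big(1 - \dfrac{L}{n_o}\Big)\sigma^{2}, & i \le n_o \\[2mm] (\sigma^{tf})^{2} = \sigma^{2}, & n_o < i \le m \end{cases} \qquad (6.18)$$

That is, the signal-related part of each of the largest $n_o$ eigenvalues is amplified
by a factor of $L/n_o$ when the STFD matrix is used.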
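
The eigenvalue relationship in Equation (6.18) is easy to confirm numerically. The sketch below is illustrative only: it builds the expected matrices directly from Equations (6.14) and (6.15) for an assumed six-sensor ULA with two sources, so the array geometry, source powers, and the `steering` helper are assumptions rather than values from the text.

```python
import numpy as np

def steering(theta_deg, m, d=0.5):
    """ULA steering vector; d is the spacing in wavelengths (assumed convention)."""
    return np.exp(-2j * np.pi * d * np.arange(m) * np.sin(np.deg2rad(theta_deg)))

m, n_o, L, sigma2 = 6, 2, 33, 1.0
A = np.column_stack([steering(th, m) for th in (-10.0, 10.0)])
R_ss = np.diag([1.0, 0.5])                       # selected-source powers
signal_part = A @ R_ss @ A.conj().T

R_xx = signal_part + sigma2 * np.eye(m)          # covariance matrix, Eq. (6.15)
D = (L / n_o) * signal_part + sigma2 * np.eye(m) # expected STFD matrix, Eq. (6.14)

# eigh returns eigenvalues in ascending order; reverse to match the text.
lam_o, vec_o = np.linalg.eigh(R_xx)
lam_tf, vec_tf = np.linalg.eigh(D)
lam_o, lam_tf = lam_o[::-1], lam_tf[::-1]

predicted = (L / n_o) * (lam_o - sigma2) + sigma2   # Eq. (6.18)

# Noise-subspace projectors built from the m - n_o smallest eigenvalues.
G_o = vec_o[:, : m - n_o]
G_tf = vec_tf[:, : m - n_o]
P_o = G_o @ G_o.conj().T
P_tf = G_tf @ G_tf.conj().T
```

Here `lam_tf` matches `predicted`, and the two noise-subspace projectors coincide, in line with the two bullet points above.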


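The robustness of the noise subspace obtained from a finite number of data
samples is described as follows (Zhang et al., 2001). If the third- and higher-order
derivatives of the phase of the FM signals are negligible over the time period
[t − L + 1, t + L − 1] for all values of t, then the orthogonal projections of $\{\hat{\mathbf{g}}^{tf}_i\}$
onto the column space of $\mathbf{S}^{tf}$ are asymptotically (for $N \gg L$) jointly Gaussian
distributed, with zero means and covariance matrices given by

$$E\Big[\big(\mathbf{S}^{tf}(\mathbf{S}^{tf})^{H}\hat{\mathbf{g}}^{tf}_i\big)\big(\mathbf{S}^{tf}(\mathbf{S}^{tf})^{H}\hat{\mathbf{g}}^{tf}_j\big)^{H}\Big] = \frac{\sigma^{2}L}{n_o N'}\left[\sum_{k=1}^{n_o} \frac{\lambda^{tf}_k}{(\sigma^{2}-\lambda^{tf}_k)^{2}}\,\mathbf{s}^{tf}_k(\mathbf{s}^{tf}_k)^{H}\right]\delta_{ij} \triangleq \frac{1}{N'}\,\mathbf{U}^{tf}\,\delta_{ij} \qquad (6.19)$$

$$E\Big[\big(\mathbf{S}^{tf}(\mathbf{S}^{tf})^{H}\hat{\mathbf{g}}^{tf}_i\big)\big(\mathbf{S}^{tf}(\mathbf{S}^{tf})^{H}\hat{\mathbf{g}}^{tf}_j\big)^{T}\Big] = \mathbf{0}, \quad \text{for all } i, j \qquad (6.20)$$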


Two important observations are in order. First, if the signals are both
localizable and separable in the t-f domain, reducing the number of signals from n
to $n_o$ greatly reduces the estimation error, specifically when the signals are closely
spaced. Second, regarding SNR enhancements, the preceding equations show
that error reductions using STFDs are more pronounced for the cases of low
SNR and/or closely spaced signals. When $\lambda^{o}_k \gg \sigma^{2}$ for all k = 1, 2, ..., $n_o$, the
results are almost independent of L (assuming $N \gg L$ so that N' = N − L + 1 ≈ N),
and therefore there is no obvious improvement in using the STFD over
conventional array processing. On the other hand, when some of the eigenvalues
are close to $\sigma^{2}$ ($\lambda^{o}_k \approx \sigma^{2}$ for some k = 1, 2, ..., $n_o$), which is the case for weak or
closely spaced signals, the result of Equation (6.19) is reduced by a factor of up
to $G = L/n_o$. This factor represents, in essence, the gain achieved from STFD
processing.
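
The two regimes can be read off from a short calculation. The sketch below assumes that the weight in Equation (6.19) carries the factor $\sigma^{2}L/(n_o N')$ and that N' ≈ N; the symbols $\mu_k$, $w_k$, and $w^{tf}_k$ are shorthand introduced here for illustration and are not notation from the chapter.

```latex
% Signal part of the k-th eigenvalue, and its amplified counterpart from Eq. (6.18)
\mu_k = \lambda^{o}_k - \sigma^{2}, \qquad
G = \frac{L}{n_o}, \qquad
\lambda^{tf}_k - \sigma^{2} = G\,\mu_k

% Weight of the k-th term for the covariance matrix (form of Eq. 6.16) and for the STFD (Eq. 6.19)
w_k = \frac{\sigma^{2}}{N}\,\frac{\lambda^{o}_k}{(\sigma^{2}-\lambda^{o}_k)^{2}}
    = \frac{\sigma^{2}}{N}\,\frac{\sigma^{2}+\mu_k}{\mu_k^{2}}, \qquad
w^{tf}_k = \frac{\sigma^{2}G}{N'}\,\frac{\lambda^{tf}_k}{(\sigma^{2}-\lambda^{tf}_k)^{2}}
         = \frac{\sigma^{2}}{N'}\,\frac{\sigma^{2}+G\mu_k}{G\,\mu_k^{2}}

% Ratio of the two weights for N' \approx N
\frac{w^{tf}_k}{w_k} \approx \frac{\sigma^{2}+G\mu_k}{G\,(\sigma^{2}+\mu_k)}
\;\longrightarrow\;
\begin{cases}
1, & \mu_k \gg \sigma^{2} \quad \text{(strong, well-separated signals)} \\
1/G, & G\mu_k \ll \sigma^{2} \quad \text{(weak or closely spaced signals)}
\end{cases}
```

Between these limits the ratio lies between 1/G and 1, which is why the reduction is described as being "up to" the factor G.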

6.4 TIME-FREQUENCY DOA ESTIMATION TECHNIQUES 

As we addressed earlier, various DOA estimation techniques can be applied in 
the STFD framework. In this section, we introduce time-frequency MUSIC and 
time-frequency maximum likelihood techniques as examples. 


6.4.1 Time-Frequency MUSIC 

We first recall that the DOAs are estimated in the MUSIC technique by
determining the n values of $\theta$ for which the following spatial pseudo-power spectrum
(abbreviated as “spatial spectrum” hereafter) is maximized (Schmidt, 1986):

$$f_{MU}(\theta) = \big[\mathbf{a}^{H}(\theta)\,\hat{\mathbf{G}}\hat{\mathbf{G}}^{H}\,\mathbf{a}(\theta)\big]^{-1} = \big[\mathbf{a}^{H}(\theta)\,(\mathbf{I} - \hat{\mathbf{S}}\hat{\mathbf{S}}^{H})\,\mathbf{a}(\theta)\big]^{-1} \qquad (6.21)$$

where $\mathbf{a}(\theta)$ is the steering vector corresponding to $\theta$. The variance of the
estimates in the MUSIC technique (Stoica and Nehorai, 1989), assuming white noise
processes, is given by

$$E(\hat{\omega}_i - \omega_i)^{2} = \frac{1}{2N}\,\frac{\mathbf{a}^{H}(\theta_i)\,\mathbf{U}\,\mathbf{a}(\theta_i)}{h(\theta_i)} \qquad (6.22)$$

where $\omega_i$ is the spatial frequency associated with DOA $\theta_i$, and $\hat{\omega}_i$ is its estimate
obtained from MUSIC. The spatial frequency is defined as $\omega_i = \frac{2\pi d}{\lambda}\sin\theta_i$ for a
uniform linear array (ULA), where $\lambda$ is the wavelength of the impinging
wavefront, and d is the inter-element spacing between two adjacent sensors. Moreover,
U is defined in Equation (6.16), and

$$h(\theta_i) = \mathbf{d}^{H}(\theta_i)\,\mathbf{G}\mathbf{G}^{H}\,\mathbf{d}(\theta_i), \quad \text{with } \mathbf{d}(\theta_i) = d\mathbf{a}(\theta_i)/d\omega \qquad (6.23)$$


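The t-f MUSIC algorithm is very similar to MUSIC. With $n_o$ signals selected,
the DOAs are determined by locating the $n_o$ peaks of the spatial spectrum defined
from the TFD of the $n_o$ selected signals:

$$f^{tf}_{MU}(\theta) = \big[\mathbf{a}^{H}(\theta)\,\hat{\mathbf{G}}^{tf}(\hat{\mathbf{G}}^{tf})^{H}\,\mathbf{a}(\theta)\big]^{-1} = \big[\mathbf{a}^{H}(\theta)\,\big(\mathbf{I} - \hat{\mathbf{S}}^{tf}(\hat{\mathbf{S}}^{tf})^{H}\big)\,\mathbf{a}(\theta)\big]^{-1} \qquad (6.24)$$

The variance of the DOA estimates based on t-f MUSIC (Zhang et al., 2001) is
obtained as

$$E(\hat{\omega}^{tf}_i - \omega_i)^{2} = \frac{1}{2N'}\,\frac{\mathbf{a}^{H}(\theta_i)\,\mathbf{U}^{tf}\,\mathbf{a}(\theta_i)}{h^{tf}(\theta_i)} \qquad (6.25)$$

where $\hat{\omega}^{tf}_i$ is the estimate of $\omega_i$ obtained from t-f MUSIC, $\mathbf{U}^{tf}$ is defined in
Equation (6.19), and

$$h^{tf}(\theta_i) = \mathbf{d}^{H}(\theta_i)\,\mathbf{G}^{tf}(\mathbf{G}^{tf})^{H}\,\mathbf{d}(\theta_i) \qquad (6.26)$$

Note that $h^{tf}(\theta_i) = h(\theta_i)$ if $n_o = n$.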
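
The procedure can be sketched end to end in a few lines. The following is a minimal illustration, not the simulation code behind the chapter's figures: it assumes the IF laws of both sources are known exactly, uses a shorter record and a higher SNR than Example 6.3, and relies on hypothetical helper names (`steering`, `averaged_stfd`, `music_doas`) and an assumed steering-vector phase convention.

```python
import numpy as np

def steering(theta_deg, m, d=0.5):
    """ULA steering vector; d is the spacing in wavelengths (assumed convention)."""
    return np.exp(-2j * np.pi * d * np.arange(m) * np.sin(np.deg2rad(theta_deg)))

def averaged_stfd(X, inst_freqs, L):
    """Average SPWVD matrices (Eq. 6.12) along the given IF laws (Eq. 6.13)."""
    m, N = X.shape
    half = (L - 1) // 2
    lags = np.arange(-half, half + 1)
    D = np.zeros((m, m), dtype=complex)
    count = 0
    for f_q in inst_freqs:
        for t in range(half, N - half):
            w = np.exp(-4j * np.pi * f_q[t] * lags)       # kernel phase per lag
            # sum over tau of w(tau) x(t + tau) x^H(t - tau)
            D += (X[:, t + lags] * w) @ X[:, t - lags].conj().T
            count += 1
    return D / count

def music_doas(M, n_src, m, grid):
    """Return the n_src largest peaks of the MUSIC spectrum built from matrix M."""
    M = (M + M.conj().T) / 2                 # enforce Hermitian symmetry
    _, vecs = np.linalg.eigh(M)              # eigenvalues in ascending order
    G = vecs[:, : m - n_src]                 # noise-subspace eigenvectors
    A = np.column_stack([steering(th, m) for th in grid])
    spectrum = 1.0 / np.sum(np.abs(G.conj().T @ A) ** 2, axis=0)
    peaks = [i for i in range(1, len(grid) - 1)
             if spectrum[i] > spectrum[i - 1] and spectrum[i] >= spectrum[i + 1]]
    best = sorted(peaks, key=lambda i: spectrum[i], reverse=True)[:n_src]
    return np.sort(grid[best])

rng = np.random.default_rng(0)
m, N, L = 8, 256, 33
t = np.arange(N)
rate = 0.3 / (N - 1)
f1, f2 = 0.1 + rate * t, 0.4 - rate * t      # crossing linear IF laws
s1 = np.exp(2j * np.pi * (0.1 * t + 0.5 * rate * t**2))
s2 = np.exp(2j * np.pi * (0.4 * t - 0.5 * rate * t**2))
A = np.column_stack([steering(-10.0, m), steering(10.0, m)])
noise = (rng.standard_normal((m, N)) + 1j * rng.standard_normal((m, N))) / np.sqrt(2)
X = A @ np.vstack([s1, s2]) + noise          # unit-power sources, 0 dB input SNR

grid = np.arange(-90.0, 90.1, 0.1)
doa_tf = music_doas(averaged_stfd(X, [f1, f2], L), 2, m, grid)   # t-f MUSIC
doa_cov = music_doas(X @ X.conj().T / N, 2, m, grid)             # conventional MUSIC
```

Selecting a single IF law (`[f1]` with `n_src=1`) gives the $n_o = 1$ variant discussed in Section 6.3.3; in practice the IF laws would have to be estimated from the data rather than assumed known.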


Example 6.3 

Consider a ULA consisting of eight sensors spaced by a half-wavelength. Two
LFM signals are emitted by two sources from DOAs of $\theta_1 = -10°$ and $\theta_2 = 10°$,
respectively. The start and end frequencies of the first source are $f_{s1} = 0$ and
$f_{e1} = 0.5$, whereas the corresponding two frequencies for the other source are
$f_{s2} = 0.5$ and $f_{e2} = 0$. An observation period of 1024 samples is assumed.

Figure 6.3 displays the root mean-square error (RMSE) of the estimated DOA
$\hat{\theta}_1$ versus the input SNR. The curves in the figure show the theoretical and
experimental results of conventional MUSIC and t-f MUSIC (for two cases of L = 33
and L = 129). The Cramér-Rao bound (CRB) is also shown for comparison.

Both signals were selected when performing t-f MUSIC ($n_o = n = 2$).
Simulation results were averaged over 100 independent trials. The advantages of t-f
MUSIC in low-SNR cases are evident. The experimental results deviate from
the theoretical results for low SNR, since we only considered the lowest order of
the coefficients of the perturbation expansion in deriving the latter (Zhang et al.,
2001). Figure 6.4 shows estimated spatial spectra at SNR = −20 dB based on
t-f MUSIC (L = 129) and conventional MUSIC. The t-f MUSIC spectral peaks
are clearly resolved in all trials.

FIGURE 6.3 RMSE performance of DOA estimates using t-f MUSIC and MUSIC versus
input SNR.

FIGURE 6.4 Estimated spatial spectra using (a) t-f MUSIC and (b) MUSIC.


Example 6.4

Figure 6.5 shows examples of the estimated spatial spectrum based on t-f MUSIC
and conventional MUSIC where the angle separation is small ($\theta_1 = -2°$, $\theta_2 =
2°$). L is chosen to be 129 and the input SNR is −5 dB. t-f MUSIC is performed
twice, using two sets of t-f points; each set belongs to the t-f signature
of one source ($n_o = 1$). It is evident that the two signals cannot be resolved when
conventional MUSIC is applied. However, by using their distinct t-f signatures
and applying t-f MUSIC separately for each signal, the two signals become
clearly separated and moderate DOA estimation performance is achieved. It is
noted that there is a small bias in the estimates of t-f MUSIC due to the sidelobe
leakage of one signal’s TFD when the other signal is selected.

FIGURE 6.5 Estimated spatial spectra using (a) t-f MUSIC and (b) MUSIC for closely spaced
signals.

6.4.2 Time-Frequency Maximum Likelihood 

Compared to t-f MUSIC, t-f maximum likelihood (ML) methods have
advantages in dealing with coherent nonstationary sources (Zhang et al., 2000). Before
we introduce t-f ML, we first review conventional ML methods (Ziskind and
Wax, 1988).

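When the noise elements are assumed to be i.i.d. complex Gaussian
variables with zero means, the joint density function of the sampled data vectors
x(1), x(2), ..., x(N) is given by

$$f(\mathbf{x}(1), \ldots, \mathbf{x}(N)) = \prod_{i=1}^{N} \frac{1}{\pi^{m}\sigma^{2m}} \exp\!\left(-\frac{1}{\sigma^{2}}\,[\mathbf{x}(i) - \mathbf{A}\mathbf{s}(i)]^{H}[\mathbf{x}(i) - \mathbf{A}\mathbf{s}(i)]\right) \qquad (6.27)$$

The concentrated likelihood function (Ziskind and Wax, 1988) is obtained as

$$L_{ML}(\boldsymbol{\Theta}) = \mathrm{tr}\Big\{\big[\mathbf{I} - \hat{\mathbf{A}}(\hat{\mathbf{A}}^{H}\hat{\mathbf{A}})^{-1}\hat{\mathbf{A}}^{H}\big]\hat{\mathbf{R}}_{xx}\Big\} \qquad (6.28)$$

where tr(·) denotes the trace of a matrix, and $\hat{\mathbf{A}}$ is the estimate of A. The ML
estimate of $\boldsymbol{\Theta}$ is obtained as the minimizer of this expression. Let $\omega_i$ and $\hat{\omega}_i$,
respectively, denote the spatial frequency and its ML estimate associated with $\theta_i$.
The estimation error ($\hat{\omega}_i - \omega_i$) is thus asymptotically (for large N) jointly
Gaussian distributed with zero means and the following covariance matrix (Stoica
and Nehorai, 1989):

$$E\big[(\hat{\omega}_i - \omega_i)^{2}\big] = \frac{1}{2N}\big[\mathrm{Re}(\mathbf{H} \odot \mathbf{R}_{ss}^{T})\big]^{-1}\, \mathrm{Re}\big[\mathbf{H} \odot (\mathbf{R}_{ss}\mathbf{A}^{H}\mathbf{U}\mathbf{A}\mathbf{R}_{ss})^{T}\big]\, \big[\mathrm{Re}(\mathbf{H} \odot \mathbf{R}_{ss}^{T})\big]^{-1} \qquad (6.29)$$

where $\odot$ denotes the Hadamard product, Re(·) denotes the real part, and

$$\mathbf{H} = \mathbf{C}^{H}\big[\mathbf{I} - \mathbf{A}(\mathbf{A}^{H}\mathbf{A})^{-1}\mathbf{A}^{H}\big]\mathbf{C}, \quad \text{with } \mathbf{C} = d\mathbf{A}/d\omega \qquad (6.30)$$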

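Now we consider the t-f ML method. As discussed in the previous section,
we select $n_o$ signals in the t-f domain. The concentrated likelihood function
defined from the STFD matrix is obtained by replacing $\hat{\mathbf{R}}_{xx}$ by $\hat{\mathbf{D}}$:

$$L^{tf}_{ML}(\boldsymbol{\Theta}) = \mathrm{tr}\Big\{\big[\mathbf{I} - \hat{\mathbf{A}}^{o}\big((\hat{\mathbf{A}}^{o})^{H}\hat{\mathbf{A}}^{o}\big)^{-1}(\hat{\mathbf{A}}^{o})^{H}\big]\hat{\mathbf{D}}\Big\} \qquad (6.31)$$

where $\hat{\mathbf{A}}^{o}$ is the estimate of $\mathbf{A}^{o}$. Therefore, the estimation error ($\hat{\omega}^{tf}_i - \omega_i$)
associated with the t-f ML method is asymptotically (for $N \gg L$) jointly Gaussian
distributed with zero means and the covariance matrix (Zhang et al., 2000)

$$E\big[(\hat{\omega}^{tf}_i - \omega_i)^{2}\big] = \frac{1}{2N'}\big[\mathrm{Re}(\mathbf{H}^{o} \odot (\mathbf{D}^{o}_{ss})^{T})\big]^{-1}\, \mathrm{Re}\big[\mathbf{H}^{o} \odot (\mathbf{D}^{o}_{ss}(\mathbf{A}^{o})^{H}\mathbf{U}^{tf}\mathbf{A}^{o}\mathbf{D}^{o}_{ss})^{T}\big]\, \big[\mathrm{Re}(\mathbf{H}^{o} \odot (\mathbf{D}^{o}_{ss})^{T})\big]^{-1}$$

$$= \frac{1}{2N'}\big[\mathrm{Re}(\mathbf{H}^{o} \odot (\mathbf{R}^{o}_{ss})^{T})\big]^{-1}\, \mathrm{Re}\big[\mathbf{H}^{o} \odot (\mathbf{R}^{o}_{ss}(\mathbf{A}^{o})^{H}\mathbf{U}^{tf}\mathbf{A}^{o}\mathbf{R}^{o}_{ss})^{T}\big]\, \big[\mathrm{Re}(\mathbf{H}^{o} \odot (\mathbf{R}^{o}_{ss})^{T})\big]^{-1} \qquad (6.32)$$

where $\mathbf{D}^{o}_{ss} = (L/n_o)\mathbf{R}^{o}_{ss}$ denotes the source part of D in Equation (6.14), and

$$\mathbf{H}^{o} = (\mathbf{C}^{o})^{H}\big[\mathbf{I} - \mathbf{A}^{o}\big((\mathbf{A}^{o})^{H}\mathbf{A}^{o}\big)^{-1}(\mathbf{A}^{o})^{H}\big]\mathbf{C}^{o}, \quad \text{with } \mathbf{C}^{o} = d\mathbf{A}^{o}/d\omega \qquad (6.33)$$

In the case of $n_o = n$, $\mathbf{H}^{o} = \mathbf{H}$ and $\mathbf{C}^{o} = \mathbf{C}$.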

The signal localization in the t-f domain enables us to select fewer signal 
arrivals. This fact not only is important in improving estimation performance, 
particularly when the signals are closely spaced, but it also reduces the dimension 
of the optimization problem solved by the maximum likelihood algorithm, and 
subsequently reduces the computational requirement. 
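
The minimization of the concentrated likelihood function can be illustrated with an exhaustive grid search. The sketch below is a simplified stand-in for a practical optimizer: it evaluates the trace expression of Equations (6.28) and (6.31) for every DOA tuple on a coarse grid, using a noise-free expected matrix so that the minimizer is known. The helper names `steering`, `ml_cost`, and `ml_grid_search`, the grid, and the array parameters are assumptions made for illustration.

```python
import itertools
import numpy as np

def steering(theta_deg, m, d=0.5):
    """ULA steering vector; d is the spacing in wavelengths (assumed convention)."""
    return np.exp(-2j * np.pi * d * np.arange(m) * np.sin(np.deg2rad(theta_deg)))

def ml_cost(thetas_deg, M):
    """tr{[I - A (A^H A)^{-1} A^H] M}, with M a covariance or averaged STFD matrix."""
    m = M.shape[0]
    A = np.column_stack([steering(th, m) for th in thetas_deg])
    P = A @ np.linalg.pinv(A)             # projector onto the columns of A
    return np.trace((np.eye(m) - P) @ M).real

def ml_grid_search(M, n_src, grid):
    """Exhaustive search over all n_src-tuples of distinct grid angles."""
    return min(itertools.combinations(grid, n_src), key=lambda th: ml_cost(th, M))

# Expected matrix for two sources at -10 and 10 degrees (form of Eq. 6.14 or 6.15).
m, sigma2 = 8, 0.1
A_true = np.column_stack([steering(-10.0, m), steering(10.0, m)])
M = A_true @ np.diag([1.0, 0.5]) @ A_true.conj().T + sigma2 * np.eye(m)

grid = np.arange(-30.0, 31.0, 1.0)
best_pair = ml_grid_search(M, 2, grid)    # two-dimensional search, n_o = 2

# Selecting a single source (n_o = 1) reduces the search to one dimension.
M_single = 33 * np.outer(A_true[:, 0], A_true[:, 0].conj()) + sigma2 * np.eye(m)
best_single = ml_grid_search(M_single, 1, grid)
```

The number of candidate tuples grows combinatorially with the number of selected sources, which is the dimensionality reduction the paragraph above refers to.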


Example 6.5 

To demonstrate the advantages of t-f ML over conventional ML, consider a ULA
consisting of eight sensors separated by a half-wavelength. Two FM signals arrive
from $\theta_1 = -10°$ and $\theta_2 = 10°$ with the IFs $f_1(t) = 0.2 + 0.1t/N + 0.2\sin(2\pi t/N)$
and $f_2(t) = 0.2 + 0.1t/N + 0.2\sin(2\pi t/N + \pi/2)$, t = 1, ..., N. The input SNR of
both signals is −20 dB, and the number of snapshots used in the simulation is
N = 1024. L = 129 is used for t-f ML. Figure 6.6 shows the ($\theta_1$, $\theta_2$) pairs that
minimize the respective likelihood function of the t-f ML and ML methods for
20 independent trials. It is evident that t-f ML provides much improved DOA
estimation over conventional ML.


Example 6.6 

In this example, t-f ML and t-f MUSIC are compared for coherent sources. The
ULA considered in Example 6.5 is also used here. The two coherent FM signals
now have a common IF of $f_{1,2}(t) = 0.2 + 0.1t/N + 0.2\sin(2\pi t/N)$, t = 1, ..., N,
with a $\pi/2$ phase difference. The signals arrive at $\theta_1 = -2°$ and $\theta_2 = 2°$. The input
SNR of both signals is 5 dB and the number of snapshots is N = 1024. Figure 6.7
shows the ($\theta_1$, $\theta_2$) estimates of t-f ML and the estimated spatial spectra of t-f
MUSIC for five independent trials. It is clear that t-f ML can separate the two
signals, whereas t-f MUSIC cannot.


FIGURE 6.6 Estimation of $\theta_1$ and $\theta_2$ obtained from (a) t-f ML and (b) ML.

FIGURE 6.7 Contour plots of (a) the t-f ML likelihood function and (b) the spatial spectra of t-f
MUSIC.


6.5 POLARIMETRIC TIME-FREQUENCY DOA ESTIMATION

Polarization has been incorporated in array processing for improved estimation
of various signal parameters, such as DOA and time of arrival (TOA) (Ferrara
and Parks, 1983; Li and Compton, 1991; Yamada et al., 1997). The polarimetric
signature may contain valuable information that the single-polarization
Doppler frequency signature fails to provide. The spatial polarimetric
time-frequency technique for multisensor receivers uses not only the time-varying
Doppler frequency signatures but also the polarization signatures, whether they
are stationary or time-varying (Zhang et al., 2004, 2006a). Signal polarization
information empowers the STFDs to achieve enhanced DOA estimation
resolution, as it retains the integrity of eigenstructure methods and improves the
robustness of the respective signal and noise subspaces under low SNR and
highly correlated signal environments.


In this section, we introduce SPTFD and demonstrate its effectiveness
by polarimetric time-frequency ESPRIT (PTF-ESPRIT) (Obeidat et al., 2003;
Zhang et al., 2004). A MUSIC-like method for noncoherent and coherent signals
is introduced separately in Zhang et al. (2003a,b, 2006a).


6.5.1 Spatial Polarimetric Time-Frequency Distribution 

To utilize the polarization information, an electromagnetic (EM) vector sensor
can be configured to hold up to six collocated and orthogonally oriented EM
sensors and thus receive up to six EM components (three electric and three
magnetic) (Gong et al., 2009; Nehorai and Paldi, 1994). For an active radar or
sonar system, the polarization channels can be further extended into combinations
of the transmit and receive polarizations, such as vertical-vertical (VV)
and vertical-horizontal (VH). In this section, however, we only consider passive
dual-polarization scenarios for representational simplicity. The extension to
higher polarization dimensions is straightforward.

The received signal with dual polarizations can be expressed as

$$x(t) = \left[ x^{[p]}(t) \;\; x^{[q]}(t) \right]^T \tag{6.34}$$

where $(\cdot)^{[p]}$ and $(\cdot)^{[q]}$, respectively, denote two orthogonal polarizations. They
can be, for example, vertical and horizontal or right and left circular.

When an array consisting of $m$ dual-polarized sensors is considered, the data
vector for each polarization $i$ is expressed as

$$\mathbf{x}^{[i]}(t) = \mathbf{A}(\Theta)\,\mathbf{s}^{[i]}(t) + \mathbf{n}^{[i]}(t) \tag{6.35}$$

where $i$ represents either $p$ or $q$. The polarization components of the $k$th signal
can be expressed as

$$s_k^{[p]}(t) = s_k(t)\cos(\gamma_k) \quad \text{and} \quad s_k^{[q]}(t) = s_k(t)\sin(\gamma_k)\,e^{j\phi_k} \tag{6.36}$$

where $\tan(\gamma_k) = |s_k^{[q]}(t)/s_k^{[p]}(t)|$ and $\phi_k$ denote the magnitude ratio and phase
difference between the two components, respectively.

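We are now in a position to tie together the polarization, spatial, and t-f
properties of the signals incident on the sensor array. The following vector can
be constructed for both polarizations:

$$\mathbf{x}(t) = \begin{bmatrix} \mathbf{x}^{[p]}(t) \\ \mathbf{x}^{[q]}(t) \end{bmatrix} = \begin{bmatrix} \mathbf{A}(\Theta) & \mathbf{0} \\ \mathbf{0} & \mathbf{A}(\Theta) \end{bmatrix} \begin{bmatrix} \mathbf{s}^{[p]}(t) \\ \mathbf{s}^{[q]}(t) \end{bmatrix} + \begin{bmatrix} \mathbf{n}^{[p]}(t) \\ \mathbf{n}^{[q]}(t) \end{bmatrix} = \mathbf{B}(\Theta)\,\mathbf{s}(t) + \mathbf{n}(t) \tag{6.37}$$

where $\mathbf{B}(\Theta) = \begin{bmatrix} \mathbf{A}(\Theta) & \mathbf{0} \\ \mathbf{0} & \mathbf{A}(\Theta) \end{bmatrix}$ is block-diagonal, and $\mathbf{s}^{[i]}(t)$, $i = p, q$, are the
source signal vectors for polarization $i$. The STFD of the dual-polarization vector,
$\mathbf{x}(t)$, can therefore be defined as

$$\mathbf{D}_{\mathbf{xx}}(t,f) = \iint \phi(t-u,\tau)\,\mathbf{x}(u+\tau/2)\,\mathbf{x}^H(u-\tau/2)\,e^{-j2\pi f\tau}\,du\,d\tau = \begin{bmatrix} \mathbf{A}(\Theta) & \mathbf{0} \\ \mathbf{0} & \mathbf{A}(\Theta) \end{bmatrix} \begin{bmatrix} \mathbf{D}_{\mathbf{s}^{[p]}\mathbf{s}^{[p]}}(t,f) & \mathbf{D}_{\mathbf{s}^{[p]}\mathbf{s}^{[q]}}(t,f) \\ \mathbf{D}_{\mathbf{s}^{[q]}\mathbf{s}^{[p]}}(t,f) & \mathbf{D}_{\mathbf{s}^{[q]}\mathbf{s}^{[q]}}(t,f) \end{bmatrix} \begin{bmatrix} \mathbf{A}(\Theta) & \mathbf{0} \\ \mathbf{0} & \mathbf{A}(\Theta) \end{bmatrix}^H \tag{6.38}$$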
In Equation (6.38), $\mathbf{D}_{\mathbf{xx}}(t,f)$ is referred to as the spatial polarimetric
time-frequency distribution (SPTFD) matrix. This distribution serves as a general
framework within which DOA estimation techniques can be applied to take
advantage of diversity provided by the polarimetric and t-f signatures of the
sources (Obeidat et al., 2003; Zhang et al., 2003a,b, 2006a). In the next section,
PTF-ESPRIT (Obeidat et al., 2003; Zhang et al., 2004) is shown as an application
example of SPTFD.
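The stacking of the two polarization channels in Equations (6.35) through (6.37) can be illustrated numerically. The following is only a minimal sketch under stated assumptions: the helper names `ula_steering` and `dual_pol_sources`, the half-wavelength spacing, the two chirp waveforms, and all parameter values are hypothetical choices made for illustration, and noise is omitted.

```python
import numpy as np

def ula_steering(thetas_deg, m, spacing=0.5):
    """Steering matrix A(Theta) of an m-element ULA; spacing is in wavelengths (assumed)."""
    thetas = np.deg2rad(np.atleast_1d(thetas_deg))
    idx = np.arange(m)[:, None]
    return np.exp(-2j * np.pi * spacing * idx * np.sin(thetas)[None, :])

def dual_pol_sources(s, gamma_deg, phi_deg):
    """Split n source waveforms (n x N) into p and q components as in Eq. (6.36)."""
    gamma = np.deg2rad(np.asarray(gamma_deg))[:, None]
    phi = np.deg2rad(np.asarray(phi_deg))[:, None]
    s_p = s * np.cos(gamma)
    s_q = s * np.sin(gamma) * np.exp(1j * phi)
    return s_p, s_q

m, N = 4, 256
t = np.arange(N)
# Two illustrative unit-modulus chirps (hypothetical waveforms).
s = np.vstack([
    np.exp(2j * np.pi * (0.10 * t + 0.10 * t**2 / (2 * N))),
    np.exp(2j * np.pi * (0.40 * t - 0.10 * t**2 / (2 * N))),
])

A = ula_steering([-3.0, 3.0], m)
s_p, s_q = dual_pol_sources(s, [45.0, 45.0], [0.0, 180.0])

# Block-diagonal B(Theta) and the stacked noise-free data vector of Eq. (6.37).
Z = np.zeros_like(A)
B = np.block([[A, Z], [Z, A]])
x = B @ np.vstack([s_p, s_q])   # shape (2m, N)
```

The upper half of `x` is the p-polarized array output and the lower half the q-polarized output, so any STFD computed from `x` inherits the block structure of Equation (6.38).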


6.5.2 Polarimetric Time-Frequency ESPRIT 

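To achieve the rotational invariance required to apply the ESPRIT technique,
we consider a ULA with $m$ identical cross-dipoles. As illustrated in Figure 6.8,
the array is divided into two overlapping subarrays, each consisting of $m-1$
elements. Let the first subarray be composed of the leftmost $m-1$ cross-dipoles,
and the second subarray be composed of the rightmost $m-1$ cross-dipoles. The
array response matrices of the two subarrays are denoted as $\mathbf{A}_1(\Theta)$ and $\mathbf{A}_2(\Theta)$,
respectively. Accordingly,

$$\begin{bmatrix} \mathbf{A}_2(\Theta) & \mathbf{0} \\ \mathbf{0} & \mathbf{A}_2(\Theta) \end{bmatrix} = \begin{bmatrix} \mathbf{A}_1(\Theta) & \mathbf{0} \\ \mathbf{0} & \mathbf{A}_1(\Theta) \end{bmatrix} \begin{bmatrix} \mathbf{\Phi} & \mathbf{0} \\ \mathbf{0} & \mathbf{\Phi} \end{bmatrix} \tag{6.39}$$

where the rotation operator $\mathbf{\Phi}$ can be described as

$$\mathbf{\Phi} = \mathrm{diag}\left[ e^{-j2\pi \frac{d}{\lambda}\sin(\theta_1)} \;\; e^{-j2\pi \frac{d}{\lambda}\sin(\theta_2)} \;\; \cdots \;\; e^{-j2\pi \frac{d}{\lambda}\sin(\theta_n)} \right] \tag{6.40}$$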


By performing JBD on the SPTFD matrices $\mathbf{D}_{\mathbf{xx}}(t,f)$ over multiple $(t,f)$
points where the energy of the signal arrivals is concentrated, we obtain the
signal and noise subspaces, represented as matrices $\hat{\mathbf{S}}$ and $\hat{\mathbf{G}}$, respectively.
The signal eigenvectors, which make up the columns of $\hat{\mathbf{S}}$, span the signal
subspace such that there exists a transformation matrix $\mathbf{T}$ that satisfies

$$\hat{\mathbf{S}} = \begin{bmatrix} \mathbf{A}(\Theta) & \mathbf{0} \\ \mathbf{0} & \mathbf{A}(\Theta) \end{bmatrix} \mathbf{T} \tag{6.41}$$

FIGURE 6.8 Partitioning of a ULA into two subarrays.
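
Applying the same transformation matrix $\mathbf{T}$ to the steering matrices of the
two subarrays, we define the following matrices:

$$\hat{\mathbf{S}}_1 = \begin{bmatrix} \mathbf{A}_1(\Theta) & \mathbf{0} \\ \mathbf{0} & \mathbf{A}_1(\Theta) \end{bmatrix} \mathbf{T} \quad \text{and} \quad \hat{\mathbf{S}}_2 = \begin{bmatrix} \mathbf{A}_2(\Theta) & \mathbf{0} \\ \mathbf{0} & \mathbf{A}_2(\Theta) \end{bmatrix} \mathbf{T} \tag{6.42}$$

which are related by

$$\hat{\mathbf{S}}_2 = \hat{\mathbf{S}}_1 \mathbf{T}^{-1} \begin{bmatrix} \mathbf{\Phi} & \mathbf{0} \\ \mathbf{0} & \mathbf{\Phi} \end{bmatrix} \mathbf{T} = \hat{\mathbf{S}}_1 \mathbf{\Psi} \tag{6.43}$$

where the eigenvalues of $\mathbf{\Psi}$ are $e^{-j2\pi \frac{d}{\lambda}\sin(\theta_i)}$, $i = 1, 2, \ldots, n$. $\mathbf{\Psi}$ can be solved
using the least-squares or the total least-squares approach (Roy and Kailath,
1989). When least squares is applied, the solution is given by

$$\mathbf{\Psi} = \left( \hat{\mathbf{S}}_1^H \hat{\mathbf{S}}_1 \right)^{-1} \hat{\mathbf{S}}_1^H \hat{\mathbf{S}}_2 \tag{6.44}$$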
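The least-squares step can be sketched on a dual-polarized ULA. This is only an illustration under stated assumptions, not the PTF-ESPRIT algorithm itself: the signal subspace is taken from an ordinary sample covariance matrix rather than from JBD of SPTFD matrices, the polarizations are stationary (so the signal subspace has dimension $n$), and the waveforms, noise level, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, N, d = 4, 2, 512, 0.5              # sensors, sources, snapshots, spacing (wavelengths)
theta_true = np.array([-3.0, 3.0])       # DOAs in degrees
gamma = np.deg2rad([45.0, 20.0])         # polarization magnitude angles (assumed)
phi = np.deg2rad([0.0, 90.0])            # polarization phase differences (assumed)

idx = np.arange(m)[:, None]
A = np.exp(-2j * np.pi * d * idx * np.sin(np.deg2rad(theta_true))[None, :])

# Random complex source waveforms and the stacked dual-polarized data, Eq. (6.37).
s = (rng.standard_normal((n, N)) + 1j * rng.standard_normal((n, N))) / np.sqrt(2)
x = np.vstack([A @ (np.cos(gamma)[:, None] * s),
               A @ ((np.sin(gamma) * np.exp(1j * phi))[:, None] * s)])
x += 0.01 * (rng.standard_normal(x.shape) + 1j * rng.standard_normal(x.shape))

# Stand-in for the JBD step: signal subspace from the sample covariance matrix.
R = x @ x.conj().T / N
eigval, eigvec = np.linalg.eigh(R)       # eigenvalues in ascending order
S = eigvec[:, -n:]

# Rows of the leftmost / rightmost m-1 cross-dipoles in both polarization blocks.
rows1 = np.r_[0:m - 1, m:2 * m - 1]
rows2 = rows1 + 1
S1, S2 = S[rows1], S[rows2]

# Least-squares solution of S2 = S1 Psi, i.e. Psi = (S1^H S1)^{-1} S1^H S2.
Psi = np.linalg.lstsq(S1, S2, rcond=None)[0]

# Eigenvalues of Psi are exp(-j 2 pi d sin(theta_i)); invert for the DOAs.
ev = np.linalg.eigvals(Psi)
theta_est = np.sort(np.rad2deg(np.arcsin(-np.angle(ev) / (2 * np.pi * d))))
```

Because both polarization blocks share the same rotation operator, the row selection simply applies the subarray partition of Figure 6.8 to each block.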


Example 6.7 

We consider a ULA consisting of four cross-dipoles separated by a half-wavelength.
Each comprises one vertical and one horizontal dipole. To demonstrate
the advantages of the SPTFD framework and the PTF-ESPRIT algorithm,
we consider two sources (sources 1 and 2) with high-order FM waveforms,
separable in the t-f domain, in the presence of an undesired sinusoidal signal
(source 3). The DOAs of the three sources are, respectively, $\theta = -3°$, $3°$, and $5°$;
the amplitude ratio between the two polarization components is represented by
$\gamma = 45°$, $45°$, and $20°$; and the phase difference between the two polarization
components for the three sources is $\phi = 0°$, $180°$, and $0°$, respectively. The normalized
frequency of the sinusoid is 0.1. All signals have the same signal power
with input SNR = 5 dB, and the data length is 256 samples. The PWVD with a
rectangular window of length 65 is used for t-f signal representations.

For both t-f ESPRIT and PTF-ESPRIT, the array-averaged PWVD is first 
used to identify the auto-term regions. The search-free PTF-ESPRIT provides a 
DOA estimation that is compared to that of conventional ESPRIT, polarimetric 
ESPRIT, and t-f ESPRIT. Because the source signatures can be separated in the 
t-f domain in this example, only the t-f points on the signal signature of a single 
source are considered for the construction of the STFD and SPTFD matrices. 
PTF-ESPRIT outperforms other ESPRIT-based methods by taking advantage 
of such source-discriminatory capability, in addition to SNR enhancement and 
polarimetric selection. 

Figure 6.9 shows the RMSE performance of PTF-ESPRIT and other ESPRIT-based
methods versus the input SNR, where the least-squares approach is used for
all methods. Each result is obtained from 100 independent trials. For conventional
and t-f ESPRIT, only the vertical polarization components of the source signals
are used. For the t-f and PTF-ESPRIT methods, which can discriminate sources
based on their TFDs, only the first source signal is selected in STFD and SPTFD
matrix construction, respectively. The RMSE performance is evaluated for the
first source signal. It is evident that PTF-ESPRIT outperforms all other methods.
Polarimetric ESPRIT provides satisfactory DOA estimation only when input
SNR is high, whereas conventional ESPRIT fails to do so for all input SNR
levels simulated. In contrast, PTF-ESPRIT provides 1° RMSE when input SNR
is at a low level of about −7 dB.

FIGURE 6.9 RMSE performance of ESPRIT algorithms versus input SNR.


6.6 THE SPATIAL AMBIGUITY FUNCTION AND
APPLICATIONS TO DOA ESTIMATION


In addition to the t-f domain, a nonstationary signal can be analyzed in other
joint variable domains, such as time lag $(t, \tau)$, Doppler lag $(\nu, \tau)$, and Doppler
frequency $(\nu, f)$. These are related by 1D or 2D Fourier transforms (Boashash,
2003).


Ambiguity domains that describe a signal using the Doppler and lag variables
are found to be useful in constructing the SAF for DOA estimation. The
ambiguity function of a signal $x(t)$ is defined as

$$D_{xx}(\nu, \tau) = \int x\!\left(t + \frac{\tau}{2}\right) x^*\!\left(t - \frac{\tau}{2}\right) e^{j2\pi\nu t}\,dt \tag{6.45}$$

where $\nu$ and $\tau$ represent the Doppler frequency and the lag, respectively. The
ambiguity function between two signals $x_1(t)$ and $x_2(t)$ can be similarly
defined.

Thus, for a vector signal $\mathbf{x}(t)$ received at an array as expressed in Equation
(6.6), the SAF matrix (Amin et al., 2000) can be defined as

$$\mathbf{D}_{\mathbf{xx}}(\nu, \tau) = \int \mathbf{x}\!\left(t + \frac{\tau}{2}\right) \mathbf{x}^H\!\left(t - \frac{\tau}{2}\right) e^{j2\pi\nu t}\,dt \tag{6.46}$$

The ambiguity function has several inherent advantages. First, the auto-terms of
a signal are positioned near or at the origin ($\nu = 0$, $\tau = 0$). As a result, selection of
auto-term points becomes relatively easy. For example, while in the t-f domain an
LFM signal is characterized by chirp rate and initial frequency, LFM signals with
the same chirp rate but different initial frequencies share the same trajectory in
the ambiguity plane. Second, the auto-terms of all narrowband signals, regardless
of their frequencies and phases, fall on the time lag axis ($\nu = 0$), while those of
the wideband signals fall on a different $(\nu, \tau)$ region or spread over the entire
ambiguity domain.
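The second property can be checked with a discrete-time version of Equation (6.45). The sketch below is illustrative only: the function name `ambiguity`, the use of integer lag indices $l$ (so that $\tau = 2l$ samples), the zero-padding outside the valid time range, and the test waveforms are all assumptions made for this example.

```python
import numpy as np

def ambiguity(x, max_lag):
    """Discrete ambiguity function AF[k, l] = sum_t x[t+l] x*[t-l] exp(+j 2 pi k t / N).

    Row k is the Doppler bin (nu = k/N, so row N-k corresponds to nu = -k/N);
    column l is the lag index (tau = 2l samples)."""
    N = len(x)
    af = np.zeros((N, max_lag + 1), dtype=complex)
    for l in range(max_lag + 1):
        prod = np.zeros(N, dtype=complex)
        tt = np.arange(l, N - l)                  # times where both samples exist
        prod[tt] = x[tt + l] * np.conj(x[tt - l])
        af[:, l] = N * np.fft.ifft(prod)          # ifft supplies the exp(+j...) kernel
    return af

N = 128
t = np.arange(N)
sinusoid = np.exp(2j * np.pi * 0.1 * t)           # narrowband signal
chirp_rate = 0.25 / N
lfm = np.exp(1j * np.pi * chirp_rate * t**2)      # LFM sweeping 0 to 0.25

af_sin = ambiguity(sinusoid, max_lag=8)
af_lfm = ambiguity(lfm, max_lag=8)

# Doppler bin holding the peak at each lag.
peak_bins_sin = np.argmax(np.abs(af_sin), axis=0)
peak_bins_lfm = np.argmax(np.abs(af_lfm), axis=0)
```

For the sinusoid the peak stays in the zero-Doppler bin at every lag, whereas for the LFM signal the peak moves away from $\nu = 0$ in proportion to the lag, tracing a line through the origin whose slope is set by the chirp rate.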


6.6.1 Ambiguity-Domain MUSIC 

We introduce now, as an application example of SAF, ambiguity-domain MUSIC
(AD-MUSIC). As in t-f MUSIC, matrix $\mathbf{E} = [\mathbf{S}, \mathbf{G}]$, which spans the signal and
noise subspaces of the SAF matrix, $\mathbf{D}_{\mathbf{xx}}(\nu, \tau)$, can be obtained by performing JBD
on $\mathbf{D}_{\mathbf{xx}}(\nu, \tau)$ at different $(\nu, \tau)$ points. Once the noise subspace $\mathbf{G}$ of $\mathbf{D}_{\mathbf{xx}}(\nu, \tau)$,
which is constructed from $n_o$ signals, is estimated as $\hat{\mathbf{G}}$, the AD-MUSIC technique
estimates the DOAs by finding the $n_o$ largest peaks of the localization
function

$$f_{\mathrm{AD}}(\theta) = \left[ \mathbf{a}^H(\theta)\,\hat{\mathbf{G}}\hat{\mathbf{G}}^H \mathbf{a}(\theta) \right]^{-1} \tag{6.47}$$
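Once a noise subspace estimate is available, evaluating the localization function is a short computation. The sketch below assumes, for illustration only, that the noise subspace comes from the eigendecomposition of an exact model covariance matrix rather than from JBD of SAF matrices; the helper names `steering`, `music_spectrum`, and `largest_peaks`, the grid resolution, and the scenario values are hypothetical.

```python
import numpy as np

def steering(theta_deg, m, spacing=0.5):
    """ULA steering vectors (columns) for the given angles; spacing in wavelengths."""
    theta = np.deg2rad(np.atleast_1d(theta_deg))
    return np.exp(-2j * np.pi * spacing * np.arange(m)[:, None] * np.sin(theta)[None, :])

def music_spectrum(G, grid_deg, spacing=0.5):
    """Localization function 1 / (a^H G G^H a) evaluated on a grid of angles."""
    A = steering(grid_deg, G.shape[0], spacing)
    den = np.sum(np.abs(G.conj().T @ A) ** 2, axis=0)
    # Guard against an exact zero in the denominator.
    return 1.0 / np.maximum(den, np.finfo(float).tiny)

def largest_peaks(f, grid_deg, n):
    """Angles of the n largest interior local maxima of f."""
    inner = np.flatnonzero((f[1:-1] > f[:-2]) & (f[1:-1] >= f[2:])) + 1
    best = inner[np.argsort(f[inner])[-n:]]
    return np.sort(grid_deg[best])

m, n, sigma2 = 4, 2, 0.01
A = steering([0.0, 10.0], m)
R = A @ A.conj().T + sigma2 * np.eye(m)   # stand-in for the matrices fed to JBD
eigval, eigvec = np.linalg.eigh(R)        # ascending eigenvalues
G = eigvec[:, : m - n]                    # noise subspace

grid = np.linspace(-90.0, 90.0, 1801)
doa_est = largest_peaks(music_spectrum(G, grid), grid, n)
```

In AD-MUSIC the only change is where `G` comes from: the number of sources entering the subspace is the number $n_o$ of signals retained by the choice of ambiguity-domain points.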


Example 6.8 


Consider the scenario of a ULA consisting of four elements separated by a half-wavelength,
where one LFM signal and two sinusoidal signals are received. All
three signals have the same input SNR of 20 dB. The DOA of the LFM signal
is 15°, and the DOAs of the two sinusoidal signals are 0° and 10°, respectively.
The data record has 128 samples. As shown in Figure 6.10, while the ambiguity
function of the LFM signal sweeps the ambiguity domain with contributions at
the origin, the exact auto-term ambiguity function of the narrowband signals is
zero for nonzero frequency lags and may have nonzero values only along the
vertical axis $\nu = 0$. In the figure, the two vertical lines off the lag axis represent
the cross-terms between the sinusoidal components.

Suppose that we are interested only in the DOAs of the sinusoidal signals.
For this purpose, we select 24 points on the lag axis excluding the origin, and
in so doing, emphasize the narrowband components. Figure 6.11 shows the
estimated spatial spectra of AD-MUSIC and conventional MUSIC for five
independent trials. There are two dominant eigenvalues
for the case of AD-MUSIC since the LFM signal has been dropped through
the selection of ambiguity domain points. It is clear that AD-MUSIC resolves
the two sinusoidal signals while conventional MUSIC cannot separate the three
signals.

FIGURE 6.10 Ambiguity function of an LFM signal and two sinusoidal signals.

FIGURE 6.11 Estimated spatial spectra using (a) AD-MUSIC and (b) conventional MUSIC.


6.7 WIDEBAND DOA ESTIMATION


The discussion in this chapter so far has focused on the DOA estimation of 
narrowband nonstationary signals. However, in many applications the signals 
are wideband. In this case, the DOA estimator should consider the fact that the 
steering vector is now frequency-dependent. 

To estimate the DOA for a general class of wideband signals, conventional
techniques use the Fourier transform to decompose these signals into a set of
narrowband components, which can then be processed either incoherently or coherently.
The incoherent approaches are relatively simple and estimate the DOA from
the average of the spatial spectra corresponding to different frequency bins.
However, coherent approaches are often preferred because of their superior
performance. A popularly used technique, coherent signal-subspace (CSS) processing,
was proposed by Wang and Kaveh (1985) and was further developed
in several papers (see Friedlander and Weiss, 1993, and references therein). The
fundamental concept of CSS is to use a set of focusing matrices that map the steering
vectors at different frequencies into a steering vector at a reference frequency
before they can be coherently combined.

Several t-f and ambiguity domain DOA estimation methods have been developed
for the estimation of wideband LFM signals (Gershman and Amin, 2000;
Ma and Goh, 2006; Wang and Xia, 2000). Wang and Xia (2000) employed
t-f analysis to estimate chirp rates and iteratively compensated the signal chirp
structure. A good estimate of the signal DOAs was required to initialize this
iterative processing. By assuming that the wideband signals are separable in the
t-f domain and that their IFs do not change rapidly, Gershman and Amin (2000)
used a sufficiently short sliding window to construct the STFD matrices so as to
preserve the narrowband structure of the array manifold. The focusing matrices
were then applied to the STFD matrices at selected t-f points corresponding to the
source t-f signatures. Ma and Goh (2006) considered the ambiguity domain for
the DOA estimation of wideband LFM signals with chirp rates that were assumed
to be known. Multiple chirps with identical chirp rates were allowed in this technique.
Incoherent and coherent processing techniques were also compared (Ma
and Goh, 2006).

In essence, STFD wideband DOA estimation methods can incorporate CSS 
processing for nonstationary signals, discussed previously. The LFM signal 
properties and the t-f signal representations offered may be used in several 
ways: 


• Decomposition of LFM signals into a spectrum of frequency bins is 
inherently performed in t-f analysis. 

• For LFM signals that have distinct characteristics in the t-f domain, DOA 
estimation can be performed on individual sources. 

• LFM signals are instantaneously narrowband, allowing the focusing matrices
to be applied to t-f points.


6.8 TIME-FREQUENCY POINTS

Auto-term and cross-term time-frequency points represent different aspects of 
the contribution of signals in the bilinear transform and they behave differently 
in DOA estimation. This section considers the behavior of cross-terms in DOA 
estimation and the proper selection of auto-term and cross-term points. 


6.8.1 Auto-Term and Cross-Term Points 


As discussed in Section 6.2, cross-terms are a by-product of the TFD’s
bilinearity. Although different kernels have different ways of mitigating cross-terms
(Cohen, 1995; Jeong and Williams, 1992), their complete removal is
nevertheless very difficult to achieve.

There are two types of cross-terms in the underlying DOA estimation problems.
The first type is due to interactions between the components of the same
source signal. These cross-terms always reside, along with the auto-terms, on the
main diagonal of the source TFD matrix. They share the same steering vector as
that of the auto-terms and thus can be similarly treated. The other type of cross-term
is generated from the interactions between two signal components belonging
to two different sources. These are associated with the cross-TFD of the source signals
and, at any given time-frequency point, they constitute the off-diagonal
entries of the source TFD matrices. Here we consider the second type.

To understand the role of cross-terms in DOA estimation, it is important to 
compare them to the cross-correlation between signals in conventional array 
processing, with properties that are well studied. When cross-terms are present 
at the selected t-f point, they appear as off-diagonal elements in the source TFD 
matrix. On the other hand, when signals are correlated, the off-diagonal elements 
of the covariance matrix of the source signals represent the cross-correlation 
between two source signals. DOA estimation problems can usually be solved 
when the signals are partially correlated, provided that the full-rank property of 
the source signals’ covariance matrix is maintained. 

The cross-correlation terms and the cross-term TFDs have an analogous form
and similar function. That is, cross-term TFDs can be exploited in DOA estimation
as long as the full-rank subspace of the STFD matrix is achievable (Amin
and Zhang, 2000). It is noted that the covariance matrix is obtained as a result of
statistical or ensemble averages, whereas the STFD matrix is defined at a $(t, f)$
point and its value usually varies with respect to time $t$ and frequency $f$. When
multiple $(t, f)$ points are incorporated, the effect of a cross-term may be reduced,
since the cross-term usually oscillates with respect to time.


6.8.2 Selection of Time-Frequency Points 

The selection of t-f points is relatively straightforward when nonstationary
signals are present with a clear IF signature and moderate input SNR. Most TFDs
yield high auto-term values in the vicinity of the true IF signatures.
A problem may arise, however, when the signals have a low SNR
and their IF signature in the t-f domain is not clear. In addition, we may want to
separate auto-term from cross-term points. For example, including cross-terms
between selected and other sources may compromise the source discrimination
capability in DOA estimation.

A relatively simple way of enhancing auto-term TFDs and suppressing cross-term
TFDs is through computing the TFDs corresponding to the signals obtained
at each individual receive sensor and then averaging the TFDs across the array
sensors (Mu et al., 2003). Averaging introduces a weighting function in the t-f
domain that decreases the noise levels, reduces the interactions of the source
signals, and mitigates the cross-terms. This is achieved independently of the
temporal characteristics of the source signals and without causing any smearing
of the signal terms.

The TFD of the noise-free signal received at the $i$th array sensor, $x_i(t) =
\sum_{k=1}^{n} a_i(\theta_k) s_k(t)$, where $a_i(\theta_k)$ is the $i$th element of steering vector $\mathbf{a}(\theta_k)$, is
expressed as

$$D_{x_i x_i}(t,f) = \sum_{k=1}^{n} \sum_{l=1}^{n} a_i(\theta_k)\, a_i^*(\theta_l)\, D_{s_k s_l}(t,f) \tag{6.48}$$

The averaging of $D_{x_i x_i}(t,f)$ for $i = 1, \ldots, m$ yields

$$\bar{D}_{xx}(t,f) = \frac{1}{m} \sum_{i=1}^{m} D_{x_i x_i}(t,f) = \sum_{k=1}^{n} \sum_{l=1}^{n} \beta_{k,l}\, D_{s_k s_l}(t,f) \tag{6.49}$$

where $\beta_{k,l} = \mathbf{a}^H(\theta_l)\,\mathbf{a}(\theta_k)/m$ is the spatial correlation coefficient between the $k$th and
the $l$th sources. The average of the TFDs over different array sensors is the trace
of the corresponding STFD matrix, $\mathbf{D}_{\mathbf{xx}}(t,f)$, up to the normalization factor, $m$.
With the introduction of the spatial signature between two source signals, it
becomes clear that $\beta_{k,l}$ is equal to unity for the same source signal (i.e., $k = l$,
corresponding to the auto-term t-f points) and that its magnitude is smaller than
unity for two different source signals (i.e., $k \neq l$, corresponding to the cross-term
t-f points).
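The size of the cross-term weights in Equation (6.49) is easy to evaluate for a given geometry. The short sketch below assumes an eight-element half-wavelength ULA with arrivals at −25°, 0°, and 25°; the function names and the geometry are illustrative choices, not part of the method.

```python
import numpy as np

def steering_vector(theta_deg, m, spacing=0.5):
    """Steering vector a(theta) of an m-element ULA; spacing in wavelengths (assumed)."""
    phase = -2j * np.pi * spacing * np.arange(m) * np.sin(np.deg2rad(theta_deg))
    return np.exp(phase)

def spatial_correlation(theta_k_deg, theta_l_deg, m, spacing=0.5):
    """beta_{k,l} = a^H(theta_l) a(theta_k) / m, the weight of D_{s_k s_l} in Eq. (6.49)."""
    a_k = steering_vector(theta_k_deg, m, spacing)
    a_l = steering_vector(theta_l_deg, m, spacing)
    return np.vdot(a_l, a_k) / m          # vdot conjugates its first argument

m = 8
doas = [-25.0, 0.0, 25.0]
beta = np.array([[spatial_correlation(tk, tl, m) for tl in doas] for tk in doas])
beta_mag = np.abs(beta)
```

The diagonal of `beta_mag` is exactly one (auto-terms pass unattenuated), while the off-diagonal entries are well below one for these widely separated arrivals, which is the mechanism behind the cross-term reduction obtained by array averaging.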

In the presence of additive noise, the received signal becomes

$$x_i(t) = \sum_{k=1}^{n} a_i(\theta_k)\, s_k(t) + n_i(t) \tag{6.50}$$

where $n_i(t)$ is the additive Gaussian noise. The TFD of $x_i(t)$ is expressed as

$$D_{x_i x_i}(t,f) = \sum_{k=1}^{n} \sum_{l=1}^{n} a_i(\theta_k)\, a_i^*(\theta_l)\, D_{s_k s_l}(t,f) + 2\,\mathrm{Re}\!\left[ \sum_{k=1}^{n} a_i(\theta_k)\, D_{s_k n_i}(t,f) \right] + D_{n_i n_i}(t,f) \tag{6.51}$$

When the noise elements are assumed independent of each other and of the
signals, averaging over the array elements will reduce the variance of both the
second and third terms on the right side of Equation (6.51) by a factor of $m$.

To make the selection of auto-term points meaningful, it is often desirable
that the selected t-f points have enough significance in the t-f domain. This can
be accomplished by judging whether the TFD strength at a t-f point exceeds a
certain preset level. The TFD strength indicator may be chosen as the sum of the
absolute value of the auto-sensor TFDs, $\sum_{i=1}^{m} |D_{x_i x_i}(t,f)|$, or, alternatively, as
the norm of the STFD matrix, $\mathrm{norm}[\mathbf{D}_{\mathbf{xx}}(t,f)]$. Then the auto-term t-f points can
be selected by comparing the ratio between the average of the auto-sensor TFDs
(or, equivalently, the trace of the STFD matrix) and the TFD strength indicator
(Belouchrani et al., 2001; Linh-Trung et al., 2005; Zhang and Amin, 2006). On
the other hand, a t-f point can be considered as a cross-term point when the
above ratio is below a certain threshold.

Although the array averaging or trace of the STFD matrix is simple, it may
identify some false auto-term t-f points when the spatial correlation between
sources is high (e.g., the sources have close DOAs). In this case, auto-term
selection can be improved by prewhitening the received data to orthogonalize
the source spatial signatures before TFD averaging or matrix tracing is applied
(Belouchrani et al., 2001; Linh-Trung et al., 2005; Zhang and Amin, 2006). When
the condition $m > n$ is satisfied, a convenient way to do this is to compute the
prewhitening matrix (Belouchrani and Amin, 1998) as

$$\mathbf{W} = \left( \mathbf{\Lambda}_s - \sigma^2 \mathbf{I}_n \right)^{-1/2} \mathbf{S}^H \tag{6.52}$$

where $\mathbf{S}$ is the signal subspace, $\mathbf{\Lambda}_s$ is the diagonal matrix containing the
eigenvalues corresponding to $\mathbf{S}$, and $\sigma^2$ is the noise power. This yields the $n \times 1$
prewhitened data vector

$$\mathbf{z}(t) = \mathbf{W}\mathbf{x}(t) \tag{6.53}$$

which, in the absence of noise, has the $n \times n$ identity matrix as its covariance
matrix. The prewhitening matrix can be derived from an STFD matrix to take
advantage of SNR enhancement and source discrimination (Zhang and Amin,
2006).
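The whitening property claimed for Equations (6.52) and (6.53) can be verified numerically. The following sketch assumes an exact model covariance $\mathbf{R} = \mathbf{A}\mathbf{P}\mathbf{A}^H + \sigma^2\mathbf{I}$ with known noise power; the array size, DOAs, source powers, and variable names are hypothetical values chosen for illustration.

```python
import numpy as np

m, n, sigma2 = 8, 3, 0.1
theta = np.deg2rad([-25.0, 0.0, 25.0])
A = np.exp(-1j * np.pi * np.arange(m)[:, None] * np.sin(theta)[None, :])  # half-wavelength ULA
P = np.diag([1.0, 2.0, 0.5])                    # uncorrelated source powers (assumed)

R_signal = A @ P @ A.conj().T                   # noise-free covariance
R = R_signal + sigma2 * np.eye(m)

# Signal subspace S and its eigenvalues Lambda_s (eigh returns ascending order).
lam, V = np.linalg.eigh(R)
S, lam_s = V[:, -n:], lam[-n:]

# Prewhitening matrix of Eq. (6.52).
W = np.diag((lam_s - sigma2) ** -0.5) @ S.conj().T

# Covariance of z(t) = W x(t) in the absence of noise, and the whitened signatures.
R_z = W @ R_signal @ W.conj().T
U = W @ A @ np.sqrt(P)
```

Here `R_z` equals the identity and `U` is unitary, so the whitened spatial signatures are orthogonal even though the original steering vectors are not.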


Example 6.9 

Consider an eight-sensor ULA with inter-element spacing of a half-wavelength.
Three LFM signals, $s_1(t)$, $s_2(t)$, and $s_3(t)$, arrive at the array with angles of arrival
(AOAs) of −25°, 0°, and 25°, with the respective start and end frequencies given
by (0.45, 0.25), (0.33, 0.13), and (0.25, 0.05). The length of the signal sequence
is set to 256. In the t-f plane, the source signals have parallel signatures. The
cross-term of $s_1(t)$ and $s_3(t)$ also forms a structure with a frequency that changes
linearly from 0.35 to 0.15 and therefore lies close to the t-f signature of $s_2(t)$.

FIGURE 6.12 WVD with and without array averaging: (a) WVD in the noise-free case at the first element;
(b) array-averaged WVD in the noise-free case; (c) WVD of the corrupted signal at the first element; and (d)
array-averaged WVD of the corrupted signals.


Figure 6.12(a) depicts the WVD of the signals at the first sensor in a noise-free
environment. With averaging of the WVD over the eight sensors, substantial
cross-term reduction is achieved because the spatial signatures of the sources
are weakly correlated. Figure 6.12(b) shows the corresponding array-averaged
WVD. As a result, averaging WVDs across the array significantly reduces the
cross-terms, whereas the three signals’ auto-terms remain intact.

Next, we add spatially and temporally white Gaussian noise to the data at each
array sensor so that the input SNR is −10 dB. Figures 6.12(c) and 6.12(d) depict
the reference sensor WVD and the array-averaged WVD, respectively. It is evident that
the noise obscures both the signal auto-terms and the cross-terms of the WVD at
a single sensor, so it is difficult to find appropriate t-f regions that contain high
energy of the signal arrivals. Upon averaging, both noise and cross-terms are
sufficiently reduced to clearly manifest the individual source t-f signatures, and
the signals can be individually selected for DOA estimation.


6.9 CONCLUSION

In this chapter, we reviewed the spatial time-frequency distribution concept and
various STFD-based DOA estimation techniques. STFD provides a very flexible
platform on which to perform effective DOA estimation for nonstationary
signals. This is because the similar structures of STFD and data covariance
matrices allow the use of virtually all subspace-based DOA estimation techniques.
Compared to conventional DOA estimation techniques, those based
on STFD provide significant performance improvement because they are
capable of enhancing signal-to-noise ratio and source discrimination in the
time-frequency domain.

Extensions of the STFD concept to spatial polarimetric time-frequency distribution
(SPTFD) and wideband signal applications were also addressed. We
included various examples of STFD- and SPTFD-based DOA estimation techniques
built on the theory underlying the MUSIC, maximum likelihood, and
ESPRIT methods.


DOA Estimation in the
Small-Sample Threshold
Region


Yuri I. Abramovich, Ben A. Johnson, Xavier Mestre 


7.1 INTRODUCTION 


Methods and fundamental limitations of detection/estimation techniques for multiple
closely spaced emitters in noise under small-sample support conditions
have always been treated as an important but difficult analytical problem. Yet
most of the theoretical results from estimation of the number of emitters (detection)
and their direction of arrival (DOA) in $M$-element antenna arrays still
rely on large-sample assumptions. In particular, well-known attempts to analyze
resolution and detection limits (Kaveh and Barabell, 1986; Lee and Li,
1993; Zhang, 1995) have been derived under the traditional asymptotic assumption
regarding the number of independent identically distributed (i.i.d.) training
samples $T$:

$$M = \mathrm{const}, \quad T \to \infty \quad (M/T \to 0) \tag{7.1}$$

In some cases, the signal-to-noise ratio (SNR) has been alternatively assumed
to be asymptotically large, though the two asymptotic assumptions

$$M = \mathrm{const}, \quad \mathrm{SNR} = \mathrm{const}, \quad T \to \infty \tag{7.2}$$

and

$$M = \mathrm{const}, \quad T = \mathrm{const}, \quad \mathrm{SNR} \to \infty \tag{7.3}$$

are not always interchangeable. Athley’s observation (Athley, 2002), that for a
small $T$ and SNR $\to \infty$ the DOA estimates are not efficient (they do not reach
the Cramer-Rao bound), was analytically reinforced recently by Renaux et al.
(2004) for the traditional Gaussian model with $m$ independent sources ($m < M$).

Note: This chapter is partially based on “MUSIC, GMUSIC, Maximum Likelihood Performance
Breakdown” by B. A. Johnson, Y. I. Abramovich, and X. Mestre, which appeared in IEEE Transactions
on Signal Processing in August, © 2008 IEEE (Johnson et al., 2008), as well as “GLRT-Based
Detection-Estimation for Under-Sampled Training Conditions” by Y. I. Abramovich and B. A.
Johnson, which appeared in IEEE Transactions on Signal Processing in August, © 2008 IEEE
(Abramovich and Johnson, 2008). The work was conducted under RLM/DSTO Collaboration
290905.


Therefore, if Equations (7.2) and (7.3) are not interchangeable even
for sufficiently large $T \to \infty$, SNR $\to \infty$, where the performance of DOA
detection/estimation techniques is high, one can expect even more profound
distinctions between large-sample training conditions ($T \gg M$) and
low-sample training conditions ($T < M$), where detection/estimation procedures
reach their limits. In fact, in the low-sample support case, we
are in the circumstance where not only does the actual detection/estimation
performance of various techniques approach fundamental limits, but the
traditional asymptotic ($T \to \infty$, SNR $\to \infty$) analysis techniques lose their
ability to adequately describe this performance. This severely complicates
analysis of “threshold” phenomena as the detection/estimation techniques
break down and produce large errors that depart from the Cramer-Rao
bound (CRB).

The question with respect to asymptotic assumptions (Equations 7.2 and
7.3), namely, how large $T$ and/or SNR should be so that the asymptotic analysis
accurately describes a considered scenario, is frequently asked. An interesting
answer was provided by Asym R Tomaniac, Ph.D., in the letter “Asymptomania?”
submitted by P. Stoica to IEEE Signal Processing Magazine (Tomaniac,
1998):


To summarize, a possible (brief, admittedly vague, yet deterrent) answer to the question 
asked ... is: If “my” asymptotic results do not apply to “your” scenario, then you have a 
“large-error case” and, hence, you are in trouble anyway. It is true that such an answer may 
generate further questions: 

• How do I know that I am in a “large-error” case (which could also be called a “threshold 
regime”)? 

• Is it really true that I cannot do anything about it? 


This threshold effect is observed in any DOA estimation algorithm, including
maximum likelihood (ML) estimation, with algorithm- and scenario-specific
onset points. In the case of multiple Gaussian signals impinging on an $M$-element
antenna array, maximum likelihood has been broadly treated as providing
“benchmark” estimation accuracy if effectively implemented. But in most practical
multiple-source cases, we are not able to perform a globally optimal ML
search for the DOA estimates, and therefore we want to consider Stoica’s
concerns for other detection/estimation techniques relative to maximum likelihood
performance. To conduct such an analysis in a likelihood framework, we
restructure the previous “admittedly vague” problem statement into the following
three specific practical questions:

1. How close to the “true” maximum likelihood is any model $\mathbf{R}_{\mathrm{mod}}(\hat{m})$?¹

2. If $\mathbf{R}_{\mathrm{mod}}(\hat{m})$ is away from the true maximum likelihood, can I rectify the
DOA estimates to come “sufficiently close” to the true ML condition?

3. If the (rectified) $\mathbf{R}_{\mathrm{mod}}(\hat{m})$ is sufficiently close to the true ML condition,
how likely is it to still contain a severely erroneous condition (an outlier)?

Several separate important issues are raised by our questions. The first
is whether there is a gap between the threshold conditions for a particular
detection/estimation technique and more fundamental threshold conditions
for the maximum likelihood principle itself. If such a gap exists, could the
method-specific performance breakdown (i.e., when $\mathbf{R}_{\mathrm{mod}}(\hat{m})$ contains a severely
erroneous DOA estimate) be reliably predicted and cured (Abramovich and
Spencer, 2004; Hawkes et al., 2001)? What type of theoretical analysis should
support such prediction and cure attempts?

The other important issue is whether anything can be done when even the
truly maximum likelihood estimator produces a severe outlier. Given a particular
(global) ML estimate $\mathbf{R}_{\mathrm{mod}}(\hat{m})$, can we predict that we are in the maximum
likelihood performance breakdown region? Can we at least identify the DOA
estimates that cannot be trusted?

In this chapter, we address the first issue, since our goal is to produce 
detection/estimation solutions that for a small-sample support and SNR are 
statistically as likely as the maximum likelihood solutions, without having to 
invoke an impractical global ML search. The second issue is not addressed here, 
but it must be remembered that breakdown of maximum likelihood itself cannot
be addressed within the likelihood paradigm we employ. Only a change
in the paradigm, and specifically an introduction of additional a priori information
regarding the sources, can change the maximum likelihood estimation
(MLE) performance itself, often quite dramatically. For example, the assumption
that the sources are all of the same power changes MLE resolution limits
significantly (Abramovich et al., 2008).


7.2 DOA ESTIMATION IN THE THRESHOLD REGION 

Detection/estimation techniques whose performance breakdown we intend
to address are well-known “ML-proxy” routines such as MUSIC, which asymptotically
(under conditions given in either Equation (7.2) or (7.3)) approach MLE
performance. This asymptotic equivalence, bounded by the CRB and established by
Stoica and Nehorai (1989), obviously does not mean their equivalence everywhere,
and especially not in the threshold region. In fact, large-sample ML surrogates
such as MUSIC may demonstrate a completely different threshold behavior.
While it is recognized that the performance of these surrogates is “in general
inferior to the ML technique, especially in the regime of low signal-to-noise
ratio or small number of samples” (Hurt, 1990), such surrogates are often the
only practical alternatives to the multidimensional ML global search. Thus, this
difference is often not fully explored.

1. $R_{mod}(\hat{m})$ can be formed by using any particular detection/estimation technique
to estimate the number of sources $\hat{m}$, DOAs $\hat{\theta}_1, \ldots, \hat{\theta}_{\hat{m}}$, and source powers
$\hat{P}_1, \ldots, \hat{P}_{\hat{m}}$, and then using these parameters
to form an $M \times M$ covariance matrix model of the input data.

For example, Lee and Li (1993) argued that for very closely
spaced sources, the detection SNR threshold $\epsilon_{ITC}$ below which the number of
sources is underestimated by information theoretic criteria (ITC) is significantly
lower than the SNR threshold $\epsilon_{res}$ at which the sources can be resolved and
estimated. Yet in order to illustrate this fact, they compared the actual simulation
results on ITC-based detection with the theoretic formula for the MUSIC resolution
threshold derived by Kaveh and Barabell (1986), arguing that “this expression
previously has been validated via extensive simulations.” The fact that MLE
and MUSIC DOA estimation have completely different threshold regions was 
not considered because of the lack of an accurate MLE proxy in the analyzed 
circumstance. 

This situation recently changed conceptually, after a reliable and practical
approach for MUSIC-specific performance breakdown prediction and cure was
suggested in Abramovich and Johnson (2008), Abramovich and Spencer (2004),
and Abramovich et al. (2007a). This approach is based on two quite straightforward
observations. The first is that for the (unconditional, stochastic) Gaussian
model, the likelihood function (i.e., the probability density of the model parameters
given the observed data) may be normalized to lie between zero and unity
in a way that the resultant likelihood ratio (LR) for the true (actual) covariance
matrix as an argument (i.e., $LR(R_0)$) is “scenario-invariant.” By this we mean
that the LR is described by a probability density function (p.d.f.) that does not
depend on this matrix $R_0$, but rather is fully described by a function of known
parameters such as $M$ and $T$. The second observation is that a true ML model
$R_{ML}(\hat{m})$ for $\hat{m} \geq m$ should always (deterministically) yield an LR no smaller than $LR(R_0)$:

$$LR[R_{ML}(\hat{m} \geq m)] \geq LR(R_0) \tag{7.4}$$


These two observations allow for a statistical nonasymptotic answer to our question
regarding the quality of the model $R_{mod}(\hat{m})$. Within the approach we have
called “expected likelihood” (Abramovich et al., 2004), $R_{mod}(\hat{m})$ derived by any
detection/estimation technique is treated as appropriate if

$$LR[R_{mod}(\hat{m})] > \gamma_{LR}(P_{FA}) \tag{7.5}$$

where

$$\gamma_{LR}(P_{FA}) = \arg_{\gamma} \left\{ \int_0^{\gamma} w(x, M, T)\,dx = P_{FA} \right\} \tag{7.6}$$

and $w(x, M, T)$ is the previously discussed p.d.f. for $LR(R_0)$. By the thresholding
in Equation (7.5), we check that with probability $1 - P_{FA}$, the tested model
$R_{mod}(\hat{m})$ is at least as likely as the true covariance matrix $R_0$. Note that the tested
model may have a likelihood that is not truly global, since for a sufficiently low
$P_{FA}$ ($10^{-3}$, say), we have to set the threshold $\gamma_{LR}$ well below the median value
of $LR(R_0)$ and therefore, in most cases, also below the global LR maximum
that remains unknown. However, even with this statistical but nonasymptotic
approach, we were able to demonstrate in Abramovich and Spencer (2004) and
Abramovich et al. (2007a) that for $T \to \infty$, SNR $\to \infty$, covariance models generated
from MUSIC DOA estimates and source-power estimation such as in
Ottersten et al. (1993) reliably exceed the threshold in Equation (7.5), in full
accordance with Stoica and Nehorai’s predictions (1989).

More important, we found that in cases where $T$ and/or SNR decrease to
threshold values where MUSIC loses its ability to resolve closely spaced sources
and starts to produce outlier DOA estimates, the likelihood ratio for $R_{mod}(\hat{m})$ constructed
from these “broken” solutions is reliably below the statistical threshold
$\gamma_{LR}$ in Equation (7.5). The fact that MUSIC outliers were found to be much less
likely than the true covariance matrix $R_0$ provided a straightforward capability to
specify (predict) such an outlier, as requested by our first question, and provided
an approach to replace (“cure”) the outliers by a solution that meets the threshold
in Equation (7.5), as considered in our second question. The nonasymptotic and
nonclairvoyant nature of the p.d.f. for $LR(R_0)$ played a crucial role in the success
of this approach.

However, the development of this performance breakdown “predict and cure”
technique (Abramovich and Spencer, 2002; Abramovich et al., 2004) was conducted
for scenarios with the amount of sample data $T$ still exceeding the antenna
dimension $M$ ($T > M$). This assumption not only enabled likelihood ratio formation
from quite conventional normalization of likelihood functions, but also
retained the possibility for conventional asymptotic ($T \to \infty$) statistical analysis
to be considered for threshold behavior description. In contrast, in this chapter we
focus on applications where the sample support $T$, while significantly exceeding
the number of sources $m$, is similar to or even smaller than the number of antenna
elements $M$. That is, when

$$m < T < M \tag{7.7}$$


The number of practical applications where one has to operate in such an 
“undersampled” training regime is growing. Antenna systems in radars, sonar, 
and communications systems are either growing in array size or are increasingly
relying on receiver-per-element rather than subarray architectures (Pearce,
1998). At the same time, the ability to collect the amount of sample data that
exceeds this array dimension is in most cases limited by the heterogeneity of the 
environment and is therefore not readily expandable. 

Operations in this undersampled regime require a significant shift in the statistical
framework. First, it is necessary to introduce new undersampled likelihood
ratios with the same invariance of p.d.f. for the true covariance matrix $R_0$ (see
Section 7.3). We then need to demonstrate that for this undersampled training
regime, “broken” MUSIC solutions may be once again reliably identified
because of their significantly smaller likelihood compared with the unknown
actual scenario (see Section 7.4).

Even more important than the derivation of necessary tools such as undersampled
likelihood ratios is the introduction of an appropriate theoretical framework
for description of MUSIC’s threshold behavior that does not rely on standard
asymptotics. Existing MUSIC resolution thresholds (Kaveh and Barabell, 1986;
Lee and Wengrovitz, 1990; Xu and Kaveh, 1994; Zhang, 1995) have been derived
under conventional ($T \to \infty$) asymptotic assumptions, which for $T < M$ are
obviously not valid.

Moreover, even the physical phenomena previously associated with
MUSIC breakdown have to be reconsidered. For a long time, MUSIC
breakdown has been associated with “subspace swap” (Thomas et al., 1995; Tufts
et al., 1991), where one or more sample eigenvector-eigenvalue pairs associated
with the signal subspace “actually estimate noise eigenelements instead of signal
eigenelements” (Hawkes et al., 2001). This transition behavior of the eigendecomposition
can be examined with a technique called random matrix theory (RMT)
(Beenakker, 1991; de Monvel et al., 1995), or general statistical analysis
(GSA or G-analysis) (Girko, 1987, 1998).


Unlike the traditional asymptotic ($M = \mathrm{const}$, $T \to \infty$) assumption, RMT/GSA
considers quite different (Kolmogorov, 1956) asymptotics, when $M$ and $T$
concurrently go to infinity but maintain a constant ratio ($M, T \to \infty$, $M/T \to y$,
$0 < y < \infty$). Therefore, even as the sample support $T$ and the antenna dimension
$M$ grow without bound, the ratio $M/T$ tends to a fixed number that can be
larger than one, meaning that the sample support as a percentage of array size
remains constant, making this asymptotic framework more appropriate for signal
processing applications in the small-sample support case (Mestre and Lagunas,
2008; Nadakuditi and Edelman, 2008).


To validate this new statistical tool that can embrace these undersampled
training conditions, we need to conduct this alternative asymptotic analysis and
compare it with Monte Carlo analysis for MUSIC and MLE breakdown conditions,
allowing a statistical description of the threshold phenomena, as shown in
Sections 7.5 and 7.6. This allows us to clarify the physical phenomena responsible
for the threshold behavior of our detection/estimation techniques. We close
the chapter with a summary of these results in Section 7.7.


7.3 EXPECTED LIKELIHOOD FORMULATIONS 

The expected likelihood technique introduced in Abramovich et al. (2004)
provides a number of key features desired for adaptive processing assessments:
The algorithm is based on finite-sample assumptions, provides nonclairvoyant
performance assessment, and operates with scenario distributions that are operationally
relevant. To summarize this approach, let us first examine some
fundamental results that rely on known finite-sample distributions, not for the
parameters themselves but for an important function of the parameters: their
likelihood ratio. Specifically, let us examine the Gaussian case, for which the
covariance matrix is the complete model identifier.

A standard approach to determining how closely a particular covariance
matrix model matches $T$ samples of $M$-dimensional multivariate (zero-mean)
complex Gaussian sample data, denoted $X_T = [x_1, \ldots, x_T]$, $x_t \sim \mathcal{CN}(0, R_0)$ when
the underlying process has covariance $R_0$, is to take the sample covariance
matrix $\hat{R}_M$ (which is complex Wishart (CW) distributed (Wishart, 1928) when
$T \geq M$):

$$\hat{R}_M = \frac{1}{T} \sum_{j=1}^{T} x_j x_j^H \tag{7.8}$$

and form a likelihood function w.r.t. the covariance matrix model $R_{mod}$:

$$\mathcal{L}(X_T, R_{mod}) = \left[ \frac{1}{\pi^M \det R_{mod}} \exp\{-\mathrm{Tr}[R_{mod}^{-1} \hat{R}_M]\} \right]^T \tag{7.9}$$

Normalization of this likelihood function

$$LR(R_{mod}) = \frac{\mathcal{L}(X_T, R_{mod})}{\max_{R_{mod}} \mathcal{L}(X_T, R_{mod})} \tag{7.10}$$

leads to a likelihood ratio (after the methodology in Muirhead, 1982)

$$LR(R_{mod}) = \left[ \frac{\det(R_{mod}^{-1} \hat{R}_M) \exp(M)}{\exp\{\mathrm{Tr}[R_{mod}^{-1} \hat{R}_M]\}} \right]^T \leq 1 \tag{7.11}$$

since

$$\max_{R_{mod}} \mathcal{L}(X_T, R_{mod}) = \left[ \frac{\exp(-M)}{\pi^M \det \hat{R}_M} \right]^T \quad \text{for } R_{mod} = \hat{R}_M \tag{7.12}$$
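As a minimal sketch of Equations (7.8) and (7.11), the following NumPy fragment evaluates the likelihood ratio in the log domain. The function names, array sizes, and the choice of an identity true covariance are illustrative assumptions, not part of the original development.

```python
import numpy as np

def sample_covariance(X):
    """Sample covariance matrix of (7.8) for an M x T snapshot matrix X."""
    return X @ X.conj().T / X.shape[1]

def log_lr_per_snapshot(R_mod, R_hat):
    """Log of the bracketed term in (7.11): log det(A) + M - Tr(A), A = R_mod^{-1} R_hat."""
    M = R_hat.shape[0]
    A = np.linalg.solve(R_mod, R_hat)
    _, logdet = np.linalg.slogdet(A)
    return float(logdet + M - np.trace(A).real)

rng = np.random.default_rng(0)
M, T = 4, 16  # illustrative sizes with T >= M (assumption)
X = (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))) / np.sqrt(2)
R_hat = sample_covariance(X)

# Model equal to the true covariance (identity in this toy setup).
log_lr_true = log_lr_per_snapshot(np.eye(M), R_hat)
# Unstructured ML solution R_mod = R_hat, for which (7.12) gives LR = 1.
log_lr_max = log_lr_per_snapshot(R_hat, R_hat)
# Full LR of (7.11), including the exponent T.
lr_true = np.exp(T * log_lr_true)
```

Working with the logarithm avoids underflow, since raising the bracketed term to the power $T$ quickly produces very small numbers.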


Hypothesis testing in circumstances where training sample data and primary
sample data have a scalar offset can also be supported. Such LR formulation is
based on hypothesis testing:

$$H_0: \mathcal{E}\{\hat{R}_M\} = c R_{mod}, \quad c > 0 \tag{7.13}$$

versus

$$H_1: \mathcal{E}\{\hat{R}_M\} \neq c R_{mod} \tag{7.14}$$

and the resultant scale-invariant “sphericity” LR (for $T \geq M$) (Anderson, 1958;
Muirhead, 1982) is introduced as

$$LR_{sp}(R_{mod}) = \left[ \frac{\det(R_{mod}^{-1} \hat{R}_M)}{\left[\frac{1}{M} \mathrm{Tr}(R_{mod}^{-1} \hat{R}_M)\right]^M} \right]^T \tag{7.15}$$

These likelihood ratio values are a metric on the quality of the assumed
covariance matrix model relative to the observed sample covariance. The p.d.f.
of the LR provides significant information about the estimation problem, and
the maximal LR value is of course associated with the maximum likelihood
solution, but the actual likelihood value itself is often not directly utilized. It is
indirectly utilized in tests based on the ratio of different likelihood ratios, such
as the generalized likelihood ratio test (GLRT), where the ratio of the maximal
$LR(R_{mod})$ given a particular hypothesis ($\max_{H_1} LR(R_{mod}|H_1)$) to the maximal
$LR(R_{mod})$ given the null hypothesis ($\max_{H_0} LR(R_{mod}|H_0)$) is thresholded to make
a detection decision (Van Trees, 1968). Expected likelihood makes much more
direct use of the LR value itself. Specifically, expected likelihood compares
the achieved likelihood ratio (using the form given in either Equation (7.11)
or (7.15)) for any particular covariance estimate to statistical measures of the
likelihood ratio associated with the (unknown) true model.
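The scale invariance claimed for the sphericity LR of Equation (7.15) can be checked numerically. The sketch below is only an illustration under assumed sizes and an arbitrary diagonal model; the helper name `log_lr_sphericity` is hypothetical.

```python
import numpy as np

def log_lr_sphericity(R_mod, R_hat):
    """Per-snapshot log of (7.15): log det(A) - M log(Tr(A) / M), A = R_mod^{-1} R_hat."""
    M = R_hat.shape[0]
    A = np.linalg.solve(R_mod, R_hat)
    _, logdet = np.linalg.slogdet(A)
    return float(logdet - M * np.log(np.trace(A).real / M))

rng = np.random.default_rng(1)
M, T = 4, 16  # illustrative sizes (assumption)
X = (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))) / np.sqrt(2)
R_hat = X @ X.conj().T / T

R_mod = np.diag([1.0, 2.0, 3.0, 4.0]).astype(complex)  # arbitrary test model
base = log_lr_sphericity(R_mod, R_hat)
scaled = log_lr_sphericity(7.5 * R_mod, R_hat)          # same model up to a scalar
at_sample = log_lr_sphericity(R_hat, R_hat)             # maximum, log LR = 0
```

Multiplying the model by any positive scalar leaves the value unchanged, which is the property that makes this form usable when training and primary data differ by an unknown power scaling.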

Of course, the likelihood ratio associated with the true model (denoted $R_0$)
is in general unknown. Yet under some circumstances, we can gain a priori
knowledge of the p.d.f. of the likelihood ratio of that true model, even if $LR(R_0)$
and $R_0$ itself remain unknown. While it has been long known (Wilks, 1938) that
under mild regularity conditions on the sample data, the maximum likelihood
estimator has a known asymptotic distribution, we are by definition not interested
in the asymptotic case ($T \to \infty$), as we consider the low training-sample volume
circumstance. If instead we restrict the sample data to be complex Gaussian
(denoted $\mathcal{CN}$) (i.e., $x_t \sim \mathcal{CN}(0, R_0)$), then the distribution for $LR(R_0)$ is described
analytically in Consul (1969) (where the real-valued case is described), without
resorting to asymptotic considerations. In fact, an exact distribution can be
derived for any spherically invariant random process (SIRP) (Brehm, 1982), of
which the Gaussian process is a (stationary and ergodic) subset. For the complex
Gaussian case, an expression for the p.d.f. of $LR(R_0)$ has been derived
(Abramovich et al., 2002a) and is described
in terms of Meijer’s G-function, depending solely on the number of samples
$T$ and the data dimension $M$ (i.e., the p.d.f. of $LR(R_0)$ is independent of any
scenario parameters and is therefore scenario-invariant). Since evaluation of the
G-function is difficult, the p.d.f. can be approximated (Marques and Coelho,
2008), or, since it does not depend on any scenario parameters, it can be generated
numerically using Monte Carlo techniques. In addition, the $h$th moment is
given by


$$\mathcal{E}\{[LR(R_0)]^h\} = \frac{T^{MT} \exp(Mh)}{(T+h)^{M(T+h)}} \prod_{j=1}^{M} \frac{\Gamma(T+1-j+h)}{\Gamma(T+1-j)} \tag{7.16}$$


where $\Gamma$ represents the Gamma function. The p.d.f. can then be expressed in
serial form by applying an inverse Mellin transform to this moment function
(Nagarsenker and Pillai, 1973). Note that, as expected, the moments (and therefore
the underlying p.d.f.) do not depend on $R_0$ and are fully specified by
parameters $M$ (the dimension of the data snapshots), $T$ (the number of data
snapshots), and $h$ (the moment index).
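The closed-form moment of Equation (7.16) can be compared with a Monte Carlo estimate. The sketch below makes one assumption that should be stated loudly: the simulated quantity is the bracketed term of Equation (7.11), that is, the $T$th root of the LR, because that is the normalization under which the simulation and the closed form are compared here. Sizes, seed, and function names are illustrative.

```python
import math
import numpy as np

def log_moment_716(h, M, T):
    """Log of the right-hand side of (7.16)."""
    out = M * T * math.log(T) + M * h - M * (T + h) * math.log(T + h)
    for j in range(1, M + 1):
        out += math.lgamma(T + 1 - j + h) - math.lgamma(T + 1 - j)
    return out

def simulate_lr_root(M, T, trials, rng):
    """Monte Carlo draws of the bracketed term of (7.11) for R_mod = R_0 = I (assumption)."""
    shape = (trials, M, T)
    X = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)
    R_hat = X @ X.conj().transpose(0, 2, 1) / T
    _, logdet = np.linalg.slogdet(R_hat)
    trace = np.trace(R_hat, axis1=1, axis2=2).real
    return np.exp(logdet + M - trace)

rng = np.random.default_rng(2)
M, T = 3, 8  # illustrative sizes (assumption)
draws = simulate_lr_root(M, T, 20000, rng)
mc_mean = float(draws.mean())                        # empirical first moment
closed_form = math.exp(log_moment_716(1.0, M, T))   # (7.16) with h = 1
```

Because the moments depend only on $M$, $T$, and $h$, the identity covariance used in the simulation loses no generality.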

With the p.d.f. of $LR(R_0)$ determined, any appropriate quantile can now be
chosen for use in an expected likelihood estimation process. If the likelihood
of any given set of estimates exceeds the chosen threshold (which statistically
represents the likelihood associated with the true model), the covariance model
is considered acceptable. Although there are some performance penalties asymptotically
as $T \to \infty$ (Abramovich and Johnson, 2007) compared with the global
ML estimate, in the low training-volume regime, which is our primary focus, the
technique provides a robust means to produce a “quality assessment” of potential 
solutions. 
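Since the p.d.f. of $LR(R_0)$ depends only on $M$ and $T$, the threshold of Equation (7.6) can be approximated by an empirical quantile. The following sketch is one possible way to do this, not the authors' procedure; it works with the $T$th root of the LR (a monotonic transformation), and the names `el_threshold` and `is_acceptable` are hypothetical.

```python
import numpy as np

def lr_root_draws(M, T, trials, rng):
    """Draws of the bracketed term of (7.11) at R_mod = R_0; by invariance R_0 = I is used."""
    shape = (trials, M, T)
    X = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)
    R_hat = X @ X.conj().transpose(0, 2, 1) / T
    _, logdet = np.linalg.slogdet(R_hat)
    trace = np.trace(R_hat, axis1=1, axis2=2).real
    return np.exp(logdet + M - trace)

def el_threshold(M, T, p_fa, trials=20000, seed=0):
    """Empirical lower P_FA quantile of the draws, an approximation of gamma_LR in (7.6)."""
    rng = np.random.default_rng(seed)
    return float(np.quantile(lr_root_draws(M, T, trials, rng), p_fa))

def is_acceptable(lr_root_value, gamma):
    """Expected likelihood test of (7.5), applied to the T-th root of the LR."""
    return lr_root_value > gamma

gamma_low = el_threshold(M=4, T=16, p_fa=0.01)
gamma_median = el_threshold(M=4, T=16, p_fa=0.5)
```

A candidate covariance model would be scored with the same normalization (the bracketed term of Equation (7.11)) and accepted when `is_acceptable` returns true.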

This has a number of very important applications. For example, since in
the low training-sample regime adaptive filters with no higher complexity than
necessary should be used, a key step is to determine the appropriate free parameter
space of the covariance model. Simply choosing a model that results in a maximal
likelihood ratio is unsatisfactory, as it always drives the covariance matrix free
parameter order to a level commensurate with the dimension of the covariance
matrix (in the case of a structured model) (see Equation (7.4)), or provides us with
the sample covariance matrix itself (which has a value for $LR(\hat{R}_M)$ of unity) if
our model is unstructured. To achieve a parsimonious model, the likelihood ratio
must be driven to a high enough value but no higher. The likelihood ratio of the
underlying true model is often much lower than the maximum likelihood ratio.

In Abramovich et al. (2007b), the absolute maximum $LR(R_0)$ values associated
with a properly ordered model for a 12-element antenna with 24 samples
(i.e., a circumstance where the sample covariance matrix is reasonably representative
of the true covariance matrix $R_0$; Reed et al., 1974) were shown to
average around 0.03 and never exceeded 0.1, compared to the LR value of unity
associated with the unstructured MLE estimate ($\hat{R}_M$).
Therefore, setting the complexity of the model high enough to drive the likelihood
ratio close to its ultimate maximum results in a much more intricate
model than necessary. Instead, the expected likelihood approach can provide
the means to estimate a sufficiently accurate, but still parsimonious, parametric
model complexity.

Despite its demonstrated utility, as previously formulated, the expected likelihood
technique cannot be applied to the undersampled regime (discussed in
Section 7.2), where the number of i.i.d. training samples $T$ is less than $M$. In that
case, the sample covariance matrix is rank deficient ($\det \hat{R}_M = 0$) and the normalization
used in Equation (7.11) is no longer available, even though adaptive
processors can still be applied.

The lack of a classically normalized likelihood ratio in the undersampled 
regime means that the potential applications of expected likelihood in solution 
assessment or order estimation are not available or require modification in this 
circumstance. 

For the undersampled case, a likelihood ratio $LR_u(R_{mod})$ is needed that
satisfies the following conditions:

Normalization condition:

$$0 < LR_u(R_{mod}) \leq 1 \tag{7.17}$$

Transition behavior: $LR_u(R_{mod})$ should be an “analytic extension” of $LR(R_{mod})$
(7.11). That is,

$$LR_u(R_{mod}) = LR(R_{mod}) \quad \text{for } T \geq M \tag{7.18}$$

Invariance property:

$$\text{p.d.f.}[LR_u(R_0)] = f(M, T) \tag{7.19}$$


The covariance matrix $\hat{R}_M$ in Equation (7.8) is rank deficient when $T < M$ and
therefore is described by the anti-Wishart distribution (Janik and Nowak, 2003).
$\hat{R}_M$ cannot be used directly in the likelihood ratio (7.11) since its determinant
is zero, so a substitute for it that will serve the same purpose (i.e., represent 
the information in the sample data), but have a nonzero determinant, needs to be 
found. In contrast to the oversampled case, there is no single “correct” likelihood 
ratio and several formulations can be considered, including 

• Contraction/projection of the sample matrix to a smaller non-rank-deficient 
matrix 

• Extension of the sample matrix in a manner that provides a nonzero 
determinant 

The first approach selects a subset or transform of the sample matrix that 
is not rank deficient and uses that in the likelihood ratio itself. That option is 
explored in Section 7.3.1. The second approach adds to the sample matrix in
some ordered way so that the rank is expanded to the point that the matrix is
no longer rank deficient. This option is explored, based on a maximum entropy
extension, in Section 7.3.2.

Other options certainly exist, including perturbations of existing hypothesis
tests used in the statistical mathematics literature in cases where the sample
covariance matrix is singular (John, 1973; Nagao, 1973). But the presented
options provide all the features needed to use them as undersampled likelihood
ratios in an expected likelihood framework, thereby opening up the application 
of expected likelihood to this important regime. 

To support these derivations, let us first define the circumstance of interest.
For the expected likelihood approach to function properly, the number of
training samples $T$ has to be sufficient to at least capture the salient features
of the signal subspace, where the signal subspace is defined as follows: When
the number $m$ of the covariance matrix eigenvalues that exceed the minimal
eigenvalue (equal to ambient white noise power) is smaller than the matrix dimension
$M$ ($m < M$), the following form for admissible covariance matrices can be
introduced:

$$R_0 = \sigma_0^2 I_M + R_S; \quad R_S = U_m \Lambda_0 U_m^H; \quad \Lambda_0 = \Lambda_m - \sigma_0^2 I_m \tag{7.20}$$

where $U_m$ is the $M \times m$ matrix of signal subspace eigenvectors and $\Lambda_m$ is the
$m$-variate diagonal matrix of (positive) eigenvalues.

We now proceed to examine ways to devise a likelihood ratio in the 
undersampled domain that satisfies our key requirements. 

7.3.1 Subspace Projection Formulation of the 
Undersampled LR 

As noted previously, in the undersampled regime where the number of i.i.d.
training samples $T$ is less than $M$, the sample covariance matrix $\hat{R}_M$ is rank
deficient. This means that since the sample data spans only a limited linear 
subspace with dimension T < M, any inference on a covariance matrix model 
may be provided only regarding the properties of this model within this subspace 
that is spanned by the sample data. 

The property of undersampled testing raises identifiability concerns, and
requires that projection of any model of the covariance matrix (denoted $R_{mod}$)
onto the subspace spanned by the sample data uniquely identify the entire
model $R_{mod}$. Along with this, a finite subspace model as introduced in Equation
(7.20) can be parameterized as a summation of $m$ (not necessarily independent)
planewave sources:

$$R_{mod}(m) = \sigma_0^2 I_M + S(\theta_m) B_m S^H(\theta_m); \quad B_m \in \mathcal{H}^{m \times m} \geq 0 \tag{7.21}$$

where

$$S(\theta_m) = [s(\theta_1), \ldots, s(\theta_m)] \in \mathcal{C}^{M \times m} \tag{7.22}$$

is the $M \times m$ antenna array manifold matrix for the $m$ DOAs $\theta_m = [\theta_1, \ldots, \theta_m]$,
$B_m$ is the $m \times m$ Hermitian (denoted $\mathcal{H}$) p.s.d. inter-source correlation matrix,
and $\sigma_0^2$ is the additive white noise power.

One can expect that for

$$T \geq r, \quad r = \mathrm{rank}(R_m) \leq m \tag{7.23}$$

inferences on the covariance matrix may deliver required detection/estimation 
results on the M x m (or M x r) signal subspace of the input covariance matrix, 
since the T data samples adequately span the signal subspace. 
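To make the planewave model of Equations (7.21) and (7.22) concrete, the sketch below builds $R_{mod}$ for a uniform linear array. The half-wavelength ULA manifold, the DOAs, and the source powers are assumptions chosen for illustration; the chapter does not fix a particular array geometry in this passage.

```python
import numpy as np

def ula_steering(theta_deg, M, spacing=0.5):
    """Hypothetical manifold: M-element uniform linear array, spacing in wavelengths."""
    theta = np.deg2rad(np.atleast_1d(theta_deg))
    n = np.arange(M)[:, None]
    return np.exp(2j * np.pi * spacing * n * np.sin(theta)[None, :])

def covariance_model(theta_deg, B, noise_power, M):
    """R_mod of (7.21): noise_power * I_M + S B S^H."""
    S = ula_steering(theta_deg, M)
    return noise_power * np.eye(M) + S @ B @ S.conj().T

M = 8
B = np.diag([10.0, 5.0]).astype(complex)   # two uncorrelated sources (assumption)
R_mod = covariance_model([-5.0, 7.0], B, noise_power=1.0, M=M)
eigvals = np.linalg.eigvalsh(R_mod)        # ascending order
```

With two sources, exactly two eigenvalues exceed the noise power and the remaining $M - 2$ equal it, which is the structure described by Equation (7.20).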

Let the eigendecomposition of the sample covariance matrix $\hat{R}_M$ in
Equation (7.8) for $T < M$ be presented as

$$\hat{R}_M = \hat{U}_T \hat{\Lambda}_T \hat{U}_T^H \tag{7.24}$$

where

$$\hat{U}_T^H \hat{U}_T = I_T, \quad \hat{\Lambda}_T = \mathrm{diag}(\hat{\lambda}_1 \geq \hat{\lambda}_2 \geq \cdots \geq \hat{\lambda}_T > 0) \tag{7.25}$$

Then any inference about the covariance matrix model $R_{mod}$ (not necessarily
parameterized as $R_{mod}(m)$) may be provided only regarding its “projection” onto
the linear subspace spanned by the $M \times T$ matrix of eigenvectors $\hat{U}_T$ associated
with the $T$ nonzero eigenvalues.

A likelihood ratio utilizing this projection approach was derived (Abramovich
and Johnson, 2008), where a properly normalized undersampled likelihood ratio,
denoted $LR_u(X_T|R_{mod})$, analogous to the standard $LR(R_{mod})$ in Equation (7.11),
can be presented as

$$LR_u(X_T|R_{mod}) = \left[ \frac{\det(X_T^H R_{mod}^{-1} X_T / T) \exp T}{\exp \mathrm{Tr}(X_T^H R_{mod}^{-1} X_T / T)} \right]^T, \quad T < M \tag{7.26}$$

or

$$LR_u(X_T|R_{mod}) = \left[ \frac{\prod_{j=1}^{\min(T,M)} \mathrm{eig}_j(R_{mod}^{-1} \hat{R}_M) \exp(\min(T, M))}{\exp \sum_{j=1}^{\min(T,M)} \mathrm{eig}_j(R_{mod}^{-1} \hat{R}_M)} \right]^T \tag{7.27}$$

where $\mathrm{eig}_j$ indicates the $j$th eigenvalue.

The last version clearly demonstrates that for $T \geq M$,

$$LR_u(X_T|R_{mod}) = LR(X_T|R_{mod}) \tag{7.28}$$

where $LR(X_T|R_{mod})$ is specified in Equation (7.11), thus satisfying the transition
requirement in Equation (7.18).
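The equivalence of the two forms (7.26) and (7.27), and the transition property (7.28), can be checked with a short numerical sketch. Everything below is illustrative: the function names, the diagonal test model, and the sizes are assumptions, and the comparison is carried out on the logarithm of the bracketed terms.

```python
import numpy as np

def complex_normal(M, T, rng):
    """M x T matrix of i.i.d. CN(0, 1) entries."""
    return (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))) / np.sqrt(2)

def log_lr_u_projection(X, R_mod):
    """Log of the bracketed term of (7.26), from the T x T matrix X^H R_mod^{-1} X / T (T <= M)."""
    T = X.shape[1]
    G = X.conj().T @ np.linalg.solve(R_mod, X) / T
    _, logdet = np.linalg.slogdet(G)
    return float(logdet + T - np.trace(G).real)

def log_lr_u_eig(X, R_mod):
    """Log of the bracketed term of (7.27), using the min(T, M) largest eigenvalues."""
    M, T = X.shape
    k = min(T, M)
    R_hat = X @ X.conj().T / T
    eig = np.sort(np.linalg.eigvals(np.linalg.solve(R_mod, R_hat)).real)[-k:]
    return float(np.sum(np.log(eig)) + k - np.sum(eig))

rng = np.random.default_rng(3)

# Undersampled case (T < M): the two forms should agree.
R8 = np.diag(np.linspace(1.0, 2.0, 8)).astype(complex)  # hypothetical model
X_under = complex_normal(8, 5, rng)
under_projection = log_lr_u_projection(X_under, R8)
under_eig = log_lr_u_eig(X_under, R8)

# Oversampled case (T >= M): (7.27) should reduce to the bracketed term of (7.11).
X_over = complex_normal(4, 10, rng)
R_hat = X_over @ X_over.conj().T / 10
_, logdet = np.linalg.slogdet(R_hat)
over_standard = float(logdet + 4 - np.trace(R_hat).real)
over_eig = log_lr_u_eig(X_over, np.eye(4, dtype=complex))
```

The projection form is usually preferable in the undersampled case because it works with a small, full-rank $T \times T$ matrix rather than a rank-deficient $M \times M$ one.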


Now the introduced LR (7.26) has a very straightforward interpretation,
testing the hypothesis

$$H_0: \mathcal{E}\{X_T^H R_{mod}^{-1} X_T / T\} = M I_T \tag{7.29}$$

versus

$$H_1: \mathcal{E}\{X_T^H R_{mod}^{-1} X_T / T\} \neq M I_T \tag{7.30}$$


In fact, $LR_u(X_T|R_{mod})$ tests the quality of the prewhitened (singular) matrix $\hat{C}_{mod}$,
where

$$\hat{C}_{mod} = R_{mod}^{-1/2} X_T X_T^H R_{mod}^{-1/2} \tag{7.31}$$

Specifically, this test checks how close to unity are the nonzero eigenvalues of
the prewhitened matrix $\hat{C}_{mod}/T$.

An undersampled scale-invariant “sphericity” test can be similarly introduced
that tests how close the eigenvalues $\mathrm{eig}_j(R_{mod}^{-1} \hat{R}_M)$, $j = 1, \ldots, T$ are to being
equal:

$$LR_u^{(sp)}(X_T|R_{mod}) = \left[ \frac{\det(X_T^H R_{mod}^{-1} X_T)}{\left[\frac{1}{T} \mathrm{Tr}(X_T^H R_{mod}^{-1} X_T)\right]^T} \right]^T, \quad T < M \tag{7.32}$$

The introduced “general” (Equations (7.26) and (7.27)) and “sphericity” (7.32)
undersampled LRs are properly normalized and reach an ultimate value equal
to one for the (unconditional) undersampled likelihood solution, thus satisfying
the normalization requirement (7.17). Both LRs stem from the undersampled
likelihood function (7.9) and maximization of either LR is associated with the
best “whitening” of the sample data.

For $T \geq M$, when the sample data with probability of one spans the entire
$M$-dimensional space, the likelihood ratio $LR_u(X_T|R_{mod})$ (7.26) and its associated
sphericity test version $LR_u^{(sp)}(X_T|R_{mod})$ (7.32) (where for the general case, similar
to Equation (7.27), $T$ must be replaced by $\min(T, M)$) coincide with the standard
LR (7.10) as well as the sphericity version (7.15).

Finally (and most important), it needs to be shown that, when the model $R_{mod}$
in Equations (7.26) and (7.32) coincides with the actual covariance matrix of the
input data,

$$R_{mod} = R_0 = \frac{1}{T} \mathcal{E}\{X_T X_T^H\} \tag{7.33}$$

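the p.d.f. for the likelihood ratios $LR_u(X_T|R_0)$ and $LR_u^{(sp)}(X_T|R_0)$ does not depend
on $R_0$ and is specified by $M$ and $T$ only. In this case, the properties of the
matrix $\hat{C}_0$

$$\hat{C}_0 = R_0^{-1/2} X_T X_T^H R_0^{-1/2} = \mathcal{E}_T \mathcal{E}_T^H \tag{7.34}$$

are fully specified by $\mathcal{E}_T \sim \mathcal{CN}(0, I_M)$, while the $T$ nonzero eigenvalues of the
matrix $\hat{C}_0$ are the same as the eigenvalues of the $T \times T$ matrix

$$\hat{C}_T \sim \mathcal{CW}(M \geq T, T, I_T) \tag{7.35}$$

Therefore, for $R_{mod} = R_0$,

$$LR_u(X_T|R_0) = \left[ \frac{\det(\mathcal{E}_T^H \mathcal{E}_T / T) \exp T}{\exp \mathrm{Tr}(\mathcal{E}_T^H \mathcal{E}_T / T)} \right]^T, \quad T < M, \quad \varepsilon_j \sim \mathcal{CN}_M(0, I_M) \tag{7.36}$$

while the sphericity test (7.32) may be presented as

$$LR_u^{(sp)}(X_T|R_0) = \left[ \frac{\det(\mathcal{E}_T^H \mathcal{E}_T)}{\left[\frac{1}{T} \mathrm{Tr}(\mathcal{E}_T^H \mathcal{E}_T)\right]^T} \right]^T, \quad T < M \tag{7.37}$$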

As with the matrix extension undersampled LR, distributions for $LR_u(X_T|R_0)$
and $LR_u^{(sp)}(X_T|R_0)$ in this form can be derived similarly to Consul (1969) and
Nagarsenker and Pillai (1973), where the real-valued case was considered,
and Abramovich et al. (2004, 2007b), where modifications for the complex-valued
case were introduced. Specifically, a serial representation of the p.d.f. for
$LR_u(X_T|R_0)$ can be found via Mellin’s transform of the moment function

$$\mathcal{E}\{[LR_u(X_T|R_0)]^h\} = e^{Th}\, \mathcal{E}\left\{ \left[ \frac{\det(\mathcal{E}_T^H \mathcal{E}_T / T)}{\exp \mathrm{Tr}(\mathcal{E}_T^H \mathcal{E}_T / T)} \right]^h \right\} = \frac{\exp(Th)\, T^{TM}}{(T+h)^{T(M+h)}} \prod_{j=1}^{T} \frac{\Gamma(M+h+1-j)}{\Gamma(M+1-j)}, \quad T < M \tag{7.38}$$


where $\Gamma(x)$ represents the Gamma function (Gradshteyn and Ryzhik, 2000).

Since expected likelihood (and maximum likelihood) can use any monotonic
increasing function of the LR, and since the form in Equation (7.38) often results
in numerically small values, the $T$th root of this LR can be equivalently used. In
that case, the moments for the $T^2$th root of $LR_u$ are given by

$$\mathcal{E}\{[LR_u(X_T|R_0)]^{h/T^2}\} = \frac{\exp(h)\, T^{TM}}{(T + \frac{h}{T})^{(MT+h)}} \prod_{j=1}^{T} \frac{\Gamma(M + \frac{h}{T} + 1 - j)}{\Gamma(M+1-j)}, \quad T < M \tag{7.39}$$


Note that, as with the previously presented properly sampled case, these
moment functions are dependent only on $M$ and $T$, and therefore the p.d.f.
obtained via the Mellin’s transform is also only dependent on $M$ and $T$,
establishing the “scenario-free” requirement given in Equation (7.19).
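The scenario-free property can be illustrated numerically: whitening by the true covariance removes any dependence on $R_0$, and the first moment of the bracketed term in Equation (7.36) can be compared with the closed form of Equation (7.38) at $h = 1$. This is a sketch under stated assumptions: the true covariance is an arbitrary random matrix, a Cholesky factor is used in place of the Hermitian square root, and the simulated quantity is the bracketed term (without the outer exponent $T$).

```python
import math
import numpy as np

def log_moment_738(h, M, T):
    """Log of the closed-form right-hand side of (7.38)."""
    out = T * h + T * M * math.log(T) - T * (M + h) * math.log(T + h)
    for j in range(1, T + 1):
        out += math.lgamma(M + h + 1 - j) - math.lgamma(M + 1 - j)
    return out

def lr_u_bracket(G):
    """det(G) exp(T) / exp(Tr G) for one T x T matrix G or a stack of them."""
    T = G.shape[-1]
    _, logdet = np.linalg.slogdet(G)
    return np.exp(logdet + T - np.trace(G, axis1=-2, axis2=-1).real)

rng = np.random.default_rng(4)
M, T = 6, 3  # undersampled, T < M (illustrative sizes)

# Hypothetical true covariance R_0 = A A^H + I with Cholesky factor L (R_0 = L L^H).
A = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
R0 = A @ A.conj().T + np.eye(M)
L = np.linalg.cholesky(R0)

E = (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))) / np.sqrt(2)
X = L @ E  # snapshots with covariance R_0
lr_colored = float(lr_u_bracket(X.conj().T @ np.linalg.solve(R0, X) / T))
lr_white = float(lr_u_bracket(E.conj().T @ E / T))

trials = 20000
shape = (trials, M, T)
E_all = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)
mc_mean = float(lr_u_bracket(E_all.conj().transpose(0, 2, 1) @ E_all / T).mean())
closed_form = math.exp(log_moment_738(1.0, M, T))
```

The equality of `lr_colored` and `lr_white` is a deterministic identity, since $X_T^H R_0^{-1} X_T = \mathcal{E}_T^H \mathcal{E}_T$ for any factorization of $R_0$.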

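The p.d.f. for the sphericity test (7.15) with complex Gaussian data $\mathcal{E}_T$ was
derived in Abramovich et al. (2004). For the undersampled sphericity test (7.37),
if we denote

$$\alpha_0 = [LR_u^{(sp)}(X_T|R_0)]^{1/T} \tag{7.40}$$

then using a similar methodology as in Abramovich et al. (2004), we get

$$w(\alpha_0) = C(T, M)\, \alpha_0^{M-T}\, G_{T,T}^{T,0}\left( \alpha_0 \,\middle|\, \begin{matrix} \frac{T^2-1}{T}, \frac{T^2-2}{T}, \ldots, \frac{T^2-T}{T} \\ 0, 1, \ldots, T-1 \end{matrix} \right) \tag{7.41}$$

where $G_{c,d}^{a,b}(\cdot)$ is Meijer’s G-function (Gradshteyn and Ryzhik, 2000), with

$$C(T, M) = (2\pi)^{\frac{1}{2}(T-1)}\, T^{\frac{1}{2} - TM}\, \frac{\Gamma(TM)}{\prod_{j=1}^{T} \Gamma(M-j+1)} \tag{7.42}$$

The moments of $[LR_u^{(sp)}(X_T|R_0)]^{1/T}$ are equal to

$$\mathcal{E}\{\alpha_0^h\} = T^{Th} \prod_{j=1}^{T} \frac{\Gamma(M-j+1+h)}{\Gamma(M-j+1)} \cdot \frac{\Gamma(TM)}{\Gamma(T(M+h))}, \quad T < M \tag{7.43}$$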


Note again that a p.d.f. derived from these moments is dependent only on $M$ and
$T$, as required in Equation (7.19).

Thus, it has been established that the undersampled LRs given in Equations
(7.26) and (7.32) meet the requirements for an undersampled LR, and are
therefore appropriate for use within the expected likelihood framework. The
application of this undersampled LR to a DOA estimation framework is explored
later in this chapter. Meanwhile, the other alternative approach presented earlier
for forming an undersampled likelihood ratio using matrix extension techniques
is examined.


7.3.2 A Matrix Extension Formulation of the 
Undersampled LR 


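In addition to the original sample matrix, the transformed (whitened) sample
matrix

$$\hat{C}_0 = R_0^{-1/2} \left[ \sum_{j=1}^{T} x_j x_j^H \right] R_0^{-1/2}, \quad x_j \sim \mathcal{CN}(0, R_0) \tag{7.44}$$

is described by the anti-Wishart p.d.f. (denoted $\mathcal{ACW}(T < M, M, I_M)$) given in
Equation (27) of Janik and Nowak (2003):

$$K_{T,M} \left(\det \hat{C}_{[T]}\right)^{T-M} e^{-\mathrm{Tr}\, \hat{C}_0} \prod_{l=T+1}^{M} \prod_{p=T+1}^{M} \delta\left( \frac{\det \hat{C}_{[T]lp}}{\det \hat{C}_{[T]}} \right) \tag{7.45}$$

Here $\delta$ is the Dirac delta function, $K_{T,M}$ is a normalization constant, and $\hat{C}_{[T]}$
is the upper left-hand $T \times T$ submatrix of the original matrix $\hat{C}_0$:

$$\hat{C}_0 = \begin{bmatrix} \hat{C}_{[T]} & * \\ * & * \end{bmatrix} \tag{7.46}$$

Furthermore, for each $l, p > T$, the $(T+1) \times (T+1)$ matrix $\hat{C}_{[T]lp}$ in Equation
(7.45) is generated by adjoining the $l$th row and the $p$th column of $\hat{C}_0$ to $\hat{C}_{[T]}$:

$$\hat{C}_{[T]lp} = \begin{bmatrix} \hat{C}_{[T]} & \begin{matrix} c_{1p} \\ \vdots \\ c_{Tp} \end{matrix} \\ \begin{matrix} c_{l1} & \cdots & c_{lT} \end{matrix} & c_{lp} \end{bmatrix} \tag{7.47}$$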


Noting that the first $T$ rows or columns of a rank-$T$ $\hat{C}_0$ matrix (or any set of
entries with the same number of real-valued degrees of freedom) uniquely specify
the entire matrix $\hat{C}_0$ with rank $T$, one would like to involve all these independent
entries in the formation of the undersampled likelihood ratio. However, the “low-rank”
covariance matrix $R_0$ in Equation (7.20) that defines our admissible set
of covariance matrices is described by a limited number of degrees of freedom
(DOF) given by

$$\mathrm{DOF}(R_0) = 1 + 2Mm - m^2 \tag{7.48}$$

where $(2Mm - m^2)$ is the number of DOFs that uniquely describe the rank-$m$
signal counterpart $R_S$ of the matrix $R_0$. Thus, one can expect that consistent
(with SNR $\to \infty$) testing is possible when the number of degrees of freedom
in the full-rank extension exceeds $\mathrm{DOF}(R_0)$, even if some degrees of freedom
available in $\hat{C}_0$ are not utilized. In fact, this statement is just another version of the
well-known requirement on a sample support ($T > m$) for “low-rank” covariance
matrices of the form given in Equation (7.20).

For $m < T < M$, rather than the first T rows or columns, consider a $(2T-1)$-wide
band of the matrix $\hat{C}_0$ (i.e., elements $c_{ij}$, $|i-j| \le T-1$). This band contains
almost the same number of degrees of freedom as the original matrix (in fact,
only $(M - T)$ fewer). Since the band no longer uniquely specifies the original
matrix $\hat{C}_0$ (which is degenerate with rank T), the elements outside the band may
be reconstructed in a number of different ways. Thus, by giving up a small number
of degrees of freedom and no longer fully specifying $\hat{C}_0$, a series of possible
extensions to the band matrix is opened up, including some nondegenerate
completions. In particular, the maximum entropy completion (i.e., the band extension
with the maximal determinant) for order p, denoted $\hat{C}^{(p)}$, is specified by the
Dym-Gohberg band extension method (Dym and Gohberg, 1981; Woerdeman, 1989).
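The remark that the band holds only $(M - T)$ fewer degrees of freedom than the full rank-T matrix can be checked with a short count. The sketch below assumes the usual convention of one real DOF per diagonal entry and two per entry above the diagonal of a Hermitian matrix; the symbols $N_{\mathrm{rank}}$ and $N_{\mathrm{band}}$ are introduced here only for illustration.

$$N_{\mathrm{rank}} = 2MT - T^2, \qquad N_{\mathrm{band}} = M + 2\sum_{k=1}^{T-1}(M-k) = 2MT - T^2 - (M - T)$$

For example, with $M = 10$, $T = 6$, and $m = 3$, this gives $N_{\mathrm{rank}} = 84$ and $N_{\mathrm{band}} = 80$, while Equation (7.48) gives $\mathrm{DOF}(R_0) = 1 + 60 - 9 = 52$, so the band still carries more degrees of freedom than the admissible model requires.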


In Grone et al. (1984) and Woerdeman (1989) it was proven that among all
band extensions, the Dym-Gohberg extension has the maximal determinant (i.e.,
maximal entropy), given by

$$\det\left[\hat{C}^{(p)}\right]^{-1} = \prod_{q=1}^{M} e_q^T \hat{C}_q^{-1} e_q \tag{7.49}$$

where $e_q$ is a column vector of length $(L(q) - q + 1)$ with a single unity entry in its
first position, the one corresponding to index q (where $L(q) = \min\{M, q + p\}$), and
$\hat{C}_q$ is the $(L(q) - q + 1) \times (L(q) - q + 1)$ Hermitian central block matrix in $\hat{C}_0$ with

$$\hat{C}_q = \begin{bmatrix} c_{q,q} & \cdots & c_{q,L(q)} \\ \vdots & \ddots & \vdots \\ c_{L(q),q} & \cdots & c_{L(q),L(q)} \end{bmatrix} \tag{7.50}$$

for $q = 1, \ldots, M$, $p \le T - 1$.
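As a minimal sketch of how Equation (7.49) can be evaluated numerically, the following NumPy function accumulates the log-determinant of the maximum-entropy band extension directly from the central blocks, without forming the extension. The function name `dym_gohberg_logdet` and the example dimensions are assumptions made for illustration, and indices are 0-based rather than the 1-based indices of the text.

```python
import numpy as np

def dym_gohberg_logdet(C, p):
    """Log-determinant of the maximum-entropy p-band extension of a
    Hermitian matrix C, using only its central blocks (Equation 7.49)."""
    M = C.shape[0]
    logdet = 0.0
    for q in range(M):                       # 0-based version of q = 1..M
        last = min(M - 1, q + p)             # 0-based counterpart of L(q)
        block = C[q:last + 1, q:last + 1]    # central block C_q
        e = np.zeros(last - q + 1)
        e[0] = 1.0                           # unity entry in the first position
        y = np.linalg.solve(block, e)        # first column of C_q^{-1}
        logdet -= np.log(y[0].real)          # accumulates -log(e^T C_q^{-1} e)
    return logdet

# Hypothetical example: rank-deficient sample matrix with T < M snapshots.
rng = np.random.default_rng(0)
M, T, p = 10, 6, 4
X = (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))) / np.sqrt(2)
C_hat = X @ X.conj().T / T                   # singular, rank T
print(dym_gohberg_logdet(C_hat, p))          # finite, although det(C_hat) = 0
```

When the matrix already has a banded inverse (for instance a first-order autoregressive covariance with p = 1), the extension coincides with the matrix itself, which gives a convenient sanity check for the routine.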

The Dym-Gohberg band extension method applied to the rank-deficient sample
matrix $\hat{C}_{mod}$ transforms this matrix into a positive definite (p.d.) Hermitian
matrix $\hat{C}^{(p)}$ with exactly the same elements as the sample matrix $\hat{C}_{mod}$ within
the $(2p+1)$-wide diagonal band.

Moreover, this p.d. matrix $\hat{C}^{(p)}$ is uniquely specified by all different
$(p+1) \times (p+1)$ central block matrices in $\hat{C}_{mod}$, and the only necessary and
sufficient condition for such transformations to exist is the positive definiteness
of all $(p+1) \times (p+1)$ submatrices $\hat{C}_q$ in $\hat{C}_{mod}$. Let $m < p \le T - 1$, when
$\mathrm{DOF}(R_s) < \mathrm{DOF}(\hat{C}^{(p)})$. Now the following likelihood ratio $LR_u^{(p)}(X_T|R_{mod})$ can
be introduced for the undersampled scenario:

$$LR_u^{(p)}(X_T|R_{mod}) = \left[\frac{\det\left[\left(R_{mod}^{-1/2}\hat{R}_M R_{mod}^{-1/2}\right)^{(p)}\right]\exp M}{\exp\{\mathrm{Tr}\,R_{mod}^{-1}\hat{R}_M\}}\right]^{1/M} \le 1 \tag{7.51}$$

since $\mathrm{Tr}\,\hat{R}_M R_{mod}^{-1} = \mathrm{Tr}\left[\left(R_{mod}^{-1/2}\hat{R}_M R_{mod}^{-1/2}\right)^{(p)}\right]$. Here $\left(R_{mod}^{-1/2}\hat{R}_M R_{mod}^{-1/2}\right)^{(p)} = \hat{C}^{(p)}$ is
the Dym-Gohberg p-band transformation of the matrix $\hat{C}_{mod}$. Note that, in fact,
the $LR_u^{(p)}(X_T|R_{mod})$ calculation does not require actual reconstruction of the
Dym-Gohberg extension, since its determinant is explicitly calculated via the block
matrices $\hat{C}_q$ in Equation (7.49). In this regard, this likelihood ratio may be treated as
a test on $E\{\hat{C}_q\} = I_{L(q)-q+1}$ simultaneously for all q, which for $m < p$ is consistent
with the original testing problem $E\{\hat{C}_{mod}\} = I_M$.

For $R_{mod} = R_0$, $\hat{C}_{mod} = \hat{C}_0$, where $\hat{C}_0 \sim \mathcal{ACW}(T < M, M, I_M)$. That is, $\hat{C}_0$ is
now described by the scenario-free anti-Wishart complex distribution, specified
by T and M only. Therefore, the p.d.f. for the $LR_u^{(p)}(R_0)$ does not depend on
scenario, and can be specified by no more than the parameters M, T, and p,
establishing compliance with the desired invariance property (7.19), which is
critically important in the expected likelihood methodology.

The Dym-Gohberg band extension utilized in this likelihood ratio is
computationally straightforward to generate, and is described in detail in Dym and
Gohberg (1981) and Johnson and Abramovich (2007). The final matrix
produced by applying the band extension has the property that the original band
entries are preserved and the inverse has zeroes outside the $(2p+1)$-wide band, as
expected for maximum-entropy completions (Lev-Ari et al., 1989).
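The two properties just stated (band entries preserved, inverse zero outside the band) can be illustrated with a small construction. The sketch below is one standard way of building the extension, through a banded lower-triangular factor of its inverse obtained from the central blocks; it is not necessarily the procedure used in the cited references, and the name `dym_gohberg_extension` and the example sizes are assumptions for illustration.

```python
import numpy as np

def dym_gohberg_extension(C, p):
    """Maximum-entropy p-band extension F of a Hermitian matrix C whose
    (p+1)x(p+1) central blocks are positive definite.
    Builds a banded lower-triangular V with F^{-1} = V V^H."""
    M = C.shape[0]
    V = np.zeros((M, M), dtype=complex)
    for q in range(M):
        last = min(M - 1, q + p)
        e = np.zeros(last - q + 1)
        e[0] = 1.0
        y = np.linalg.solve(C[q:last + 1, q:last + 1], e)  # first column of C_q^{-1}
        V[q:last + 1, q] = y / np.sqrt(y[0].real)          # column q of the factor
    return np.linalg.inv(V @ V.conj().T)

# Hypothetical example: M = 6 sensors, T = 4 snapshots, band order p = 2.
rng = np.random.default_rng(1)
M, T, p = 6, 4, 2
X = (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))) / np.sqrt(2)
C = X @ X.conj().T / T                      # rank-deficient sample matrix
F = dym_gohberg_extension(C, p)

i, j = np.indices((M, M))
band = np.abs(i - j) <= p
print(np.abs(F[band] - C[band]).max())      # band entries preserved (near zero)
print(np.abs(np.linalg.inv(F)[~band]).max())  # inverse vanishes outside the band
```

The product of the squared diagonal entries of `V` equals the right-hand side of Equation (7.49), so the determinant can be obtained without the final inversion.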

As with Equation (7.26), the LR in Equation (7.51) effectively extends LR-based
hypothesis-testing techniques into the important undersampled domain, but
in a manner very different from the subspace projection approach given in Section
7.3.1. Of course, in addition to the undersampled likelihood ratios derived
previously, a number of alternative hypothesis tests for the circumstance where the
sample covariance matrix is singular have been proposed in the statistical
mathematics literature. For example, Ledoit and Wolf (2002) recently re-examined
hypothesis tests proposed by John (1973) and Nagao (1973), but under the
asymptotic framework used in the Random Matrix theory. Such tests require some
adjustment to be used within expected likelihood applications, as discussed in
Abramovich and Johnson (2008), where they were shown to have properties quite
similar to the undersampled LRs given in Sections 7.3.1 and 7.3.2.

Having now established the tools necessary to apply expected likelihood
in both the properly sampled and undersampled regimes, we turn to
applications of the technique.

7.4 USE OF EXPECTED LIKELIHOOD IN THE MUSIC
THRESHOLD REGION

The expected likelihood technique just outlined has been applied to the problem
of assessing whether a MUSIC-generated estimate is an “outlier” when operating
in a multiple-source scenario at SNRs and/or sample support levels within
the threshold region for MUSIC (Abramovich et al., 2007c). That investigation
(conducted primarily by Monte Carlo simulations) demonstrated a “gap” in
the minimum sample support and/or SNR between the MUSIC-specific and
ML-intrinsic threshold conditions. In fact, it was demonstrated that for the considered
multiple-source scenarios, MLE outlier production (or “breakdown”) occurs at a
significantly lower SNR than for MUSIC. This gap allows LR-based techniques
such as information theoretic criteria (ITC) and expected likelihood to be applied
to prediction of MUSIC breakdown, since the likelihood principle continues to
operate robustly at those source SNRs where MUSIC outlier production is active.

In Abramovich et al. (2007c), the sample size T was assumed to exceed
the dimension M of the adaptive filter (antenna array), so that the
conventional (nondegenerate) ML estimate of the covariance matrix exists and
drives the LR to its ultimate limit of unity. Yet, as just discussed, for many
practical applications in the low sample support regime, i.i.d. samples are
not available in sufficient quantity to approach this conventional (Janik and
Nowak, 2003) sample support condition, nor is the condition necessary for
detection/estimation to occur. In fact, for sufficiently strong sources, it is
only necessary for the number of samples to exceed the number of sources
m (i.e., $T > m$ rather than $T > M$) for MUSIC to produce reasonably accurate
DOA estimates. For fully correlated (coherent) sources, the minimum
number of samples is even less demanding, since only an additive noise
component and a complex amplitude of the compound signal snapshot changes
from sample to sample. Spatial averaging, suggested in Shan et al. (1985) to
resolve fully correlated sources in uniform linear antenna arrays, can in fact
operate with just a single snapshot. Therefore, it is important to investigate
the undersampled (anti-Wishart) sample support condition ($T < M$) for these
problems.

With the use of undersampled likelihood ratios $LR_u$ such as those introduced
earlier (Equations 7.26 and 7.51), the expected likelihood methodology can be
extended to these cases. This methodology relies on the fact that the p.d.f. of
$LR(R_0)$ or $LR_u(R_0)$ (where $R_0$ represents the true (unknown) covariance matrix)
depends only on the dimension of the array M and the number of snapshots
T, rather than on any specific scenario parameters. A threshold LR value can
therefore be precomputed and applied to any scenario that meets the underlying
assumptions of i.i.d. Gaussian observations. If the likelihood of any given set of
estimates exceeds the chosen threshold (set to a value statistically related to the
likelihood ratio associated with the true model), then the covariance model is
considered acceptable. Application of this methodology to the DOA estimation
of a finite number of sources m fewer than the dimension of the array M can be
reiterated as follows.

A finite subspace model $R_{mod}(m)$ can be parameterized as a summation of m
(not necessarily independent) plane wave sources, as shown in Equation (7.21).
The problem can then be stated in this way: Given a set of T i.i.d. Gaussian
data $X_T \in \mathcal{CN}(0, R_0)$, how does one find the optimal (in an expected likelihood
sense) parametric covariance matrix model $R_{mod}(m)$, where m (the number of
sources, say) is an arbitrary integer?

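The expected likelihood methodology suggests that the optimum order $\hat{m}$ and
parameters $\hat{\Omega}_{\hat{m}}$ used in constructing the model $R(\hat{m})$ should be found as

$$h^* = \arg_h\left\{\mathrm{Prob}\left[LR_u(X_T|R_0) < h\right] = P_{FA}\right\} \tag{7.52}$$

$$\hat{m} = \arg\min_m\left\{\max_{\Omega_m} LR_u(X_T|R(\Omega_m)) > h^*\right\} \tag{7.53}$$

$$\hat{\Omega}_{\hat{m}} = \arg\max_{\Omega_{\hat{m}}} LR_u(X_T|R(\Omega_{\hat{m}})) > h^* \tag{7.54}$$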


Here $P_{FA} \ll 1$ is the probability that the LR produced by the true covariance
matrix falls below the threshold value $h^*$ (probability of “false alarm”). Since
the p.d.f. of the undersampled $LR_u(X_T|R_0)$ does not depend on $R_0$, this threshold
$h^*$ can be precalculated for any given $P_{FA}$, M, and T.
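Because the p.d.f. does not depend on the scenario, the threshold can be estimated once by simulating white data. The sketch below is a minimal Monte Carlo illustration, assuming the subspace-projection likelihood ratio in the form $[\det(X_T^H R^{-1} X_T/T)\exp T / \exp \mathrm{Tr}(X_T^H R^{-1} X_T/T)]^{1/T}$; the function names, trial counts, and seeds are hypothetical choices for illustration.

```python
import numpy as np

def lr_u(X, R):
    """Undersampled (T < M) likelihood ratio of model R for snapshots X (M x T),
    in the assumed subspace-projection form with a 1/T exponent."""
    M, T = X.shape
    C = X.conj().T @ np.linalg.solve(R, X) / T       # T x T matrix X^H R^{-1} X / T
    _, logdet = np.linalg.slogdet(C)
    return np.exp((logdet + T - np.trace(C).real) / T)

def expected_likelihood_threshold(M, T, p_fa, n_trials, seed=0):
    """Empirical p_fa-quantile of LR_u(X_T | R_0); R_0 = I suffices by invariance."""
    rng = np.random.default_rng(seed)
    identity = np.eye(M)
    values = np.empty(n_trials)
    for k in range(n_trials):
        X = (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))) / np.sqrt(2)
        values[k] = lr_u(X, identity)
    return np.quantile(values, p_fa)

# Illustrative call only; a small P_FA needs far more trials for a stable quantile.
h_star = expected_likelihood_threshold(M=10, T=6, p_fa=0.01, n_trials=2000)
print(h_star)
```

Using the identity matrix in place of $R_0$ is exactly what the invariance property permits: whitening any Gaussian data set by its true covariance reproduces the same statistic.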

Even for estimation only (i.e., no detection because m is known a priori), the
expected likelihood methodology still suggests finding the model parameters as

$$\hat{\Omega}_m = \arg\left\{\max_{\Omega_m} LR_u(X_T|R_{mod}(m)) > h^*\right\} \tag{7.55}$$


The key element of the suggested detection/estimation technique from
Equations (7.52) through (7.55) is the search for an appropriate solution, which in
terms of LR is statistically as likely as the actual covariance matrix, and continues
until such a solution (which exists with a probability greater than $1 - P_{FA}$)
is found. Naturally, no claims regarding global optimality of such a solution can
be made, but asymptotically, with $T \to \infty$, the solution found in Equation (7.54)
or (7.55) is associated with the maximum likelihood estimate, as the likelihood
objective function becomes “narrower” with $T \to \infty$.

This principle elucidates the important phenomenon in which the entire
expected likelihood and, more important, the ML estimation approach itself
“breaks down.” Indeed, if the global $LR_u$ extremum exists for a severely
erroneous parameter set $\Omega_m$ (i.e., far away from the local extremum in the vicinity
of $R_0$), then the ML principle is no longer adequate. Obviously, this may happen
only in the threshold area for MLE, and, as mentioned earlier, this remains
an active area of research where a number of investigators are trying to specify
the conditions (in terms of SNR, sample support, or source separation,
say) under which the probability of such “ML breakdown” reaches a certain
level (Athley, 2005; Richmond, 2006b). Later in this chapter, some additional
insight into this issue will be provided.

For now, the techniques in Equations (7.52) through (7.54) rely on the fact
that some estimation method-specific threshold conditions (such as those
experienced by spectral MUSIC) do not coincide with the ultimate ML breakdown
threshold conditions, and therefore can be detected via the threshold derived in
Equation (7.52). In this case, a set of (DOA) estimates $\hat{\theta}_m$ within $\hat{\Omega}_m$ containing
an outlier creates a covariance matrix model $R_{mod}(\hat{\Omega}_m)$, the LR of which does
not statistically approach $LR_u(X_T|R_0)$. Determining that the DOA estimation set
contains an outlier is called “MUSIC performance breakdown prediction” in
Abramovich et al. (2004, 2007c).


Practical attempts to enhance the estimation set and bring the
$LR_u(X_T|R_{mod}(m))$ beyond the threshold can lead to rectification of the outlier (referred
to as “performance breakdown cure”; Hawkes et al., 2001) if the algorithm is
not yet operating in the ML threshold area. Particular details of this technique,
applied for MUSIC performance enhancement in the threshold area,
are illustrated in Section 7.4.1, where the efficiency of expected likelihood for
rectifying outliers caused by MUSIC threshold conditions (due to lack of SNR
and/or sample support) is demonstrated. The main difference in the next section
relative to previous studies is the presence of undersampled conditions and
therefore the likelihood ratios used.


7.4.1 Monte Carlo Simulation Results 


In this section, MUSIC and MLE performance in the threshold region (of
parameters) that spans the range from “proper” MUSIC behavior (no outliers) to MLE
complete “performance breakdown” is examined. To accomplish this, an
undersampled scenario is considered with an M = 10-element uniform linear array
(ULA), T = 6 samples, array element spacing of $d/\lambda = 0.5$, and m = 3
independent equal-power Gaussian sources with a per-source input SNR of 20 dB
(stochastic source model) located at azimuth angles

$$\theta_m = \{\arcsin(-0.40);\ 0;\ \arcsin(0.06)\} \tag{7.56}$$

The covariance matrix $R_0$ for this mixture is

$$R_0 = \sum_{j=1}^{m} \sigma_j^2 S(\theta_j) S^H(\theta_j) + \sigma_0^2 I_M \tag{7.57}$$

where the noise power is $\sigma_0^2 = 1$; source SNR is given by $\sigma_j^2/\sigma_0^2$; and $S(\theta)$ is the
DOA $\theta$-dependent M-variate “steering” (antenna manifold) vector, normalized
to a norm of unity.
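For readers who wish to reproduce a scenario of this kind, the following is a minimal sketch of how the covariance matrix and a set of T snapshots could be generated. The phase convention of the steering vector, the helper names, and the random seed are assumptions made here for illustration and are not taken from the text.

```python
import numpy as np

def steering(theta, M, d_over_lambda=0.5):
    """Unit-norm ULA steering vector (assumed phase convention)."""
    n = np.arange(M)
    return np.exp(2j * np.pi * d_over_lambda * n * np.sin(theta)) / np.sqrt(M)

def scenario_covariance(thetas, snrs_db, M, noise_power=1.0):
    """Sum of rank-one source terms plus white noise, as in Equation (7.57)."""
    R = noise_power * np.eye(M, dtype=complex)
    for theta, snr_db in zip(thetas, snrs_db):
        s = steering(theta, M)
        R += noise_power * 10.0 ** (snr_db / 10.0) * np.outer(s, s.conj())
    return R

def draw_snapshots(R, T, rng):
    """T i.i.d. circular complex Gaussian snapshots with covariance R (M x T)."""
    M = R.shape[0]
    W = (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))) / np.sqrt(2)
    return np.linalg.cholesky(R) @ W

# Scenario of Equation (7.56): M = 10, T = 6, three 20-dB sources.
thetas = np.arcsin([-0.40, 0.0, 0.06])
R0 = scenario_covariance(thetas, [20.0, 20.0, 20.0], M=10)
X = draw_snapshots(R0, T=6, rng=np.random.default_rng(0))
R_hat = X @ X.conj().T / 6                   # rank-deficient sample covariance
```

With unit-norm steering vectors, each source contributes its full power to the trace of $R_0$, which gives a quick consistency check on the construction.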

The number of sources (m = 3) in the Monte Carlo simulations is assumed to
be known a priori. This is not a particularly onerous assumption in this case. The
scenario is specifically chosen at an SNR where significant MUSIC breakdown
is occurring, yet the likelihood principle (which is at the heart of traditional
information theoretic criteria order estimation; Akaike, 1974; Rissanen, 1978;
Schwarz, 1978) is still robust and therefore a cross-check using the maximum a
posteriori probability (MAP) information theoretic criteria,


$$\hat{m} = \arg\min_m\left[-\log LR(R_{mod}(m)) + \tfrac{1}{2}\,m\log N\right] \tag{7.58}$$


produces an order estimate that agrees with the actual number of independent
sources (three in this case) for every one of the $10^3$ trials conducted during the
cross-check.

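The MUSIC algorithm is implemented as usual by selecting the m largest
maxima of the MUSIC pseudo-spectra formed over the noise subspace
eigenvectors $\hat{e}_j$:

$$\varphi(\hat{R}_M, \theta) = \left[S^H(\theta)\left(\sum_{j=m+1}^{M}\hat{e}_j\hat{e}_j^H\right)S(\theta)\right]^{-1} \tag{7.59}$$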
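A minimal sketch of spectral MUSIC in its standard form is given below: the pseudo-spectrum is the inverse of the steering vector's energy in the estimated noise subspace, and the m largest local maxima over a search grid are returned. The function name, the grid, the steering-vector phase convention, and the small floor that guards against division by zero are assumptions for illustration.

```python
import numpy as np

def music_doas(R_hat, m, grid, d_over_lambda=0.5):
    """Return the m grid angles (radians) at the largest local maxima of the
    MUSIC pseudo-spectrum, together with the pseudo-spectrum itself."""
    M = R_hat.shape[0]
    _, E = np.linalg.eigh(R_hat)                 # eigenvalues in ascending order
    E_noise = E[:, :M - m]                       # M - m noise-subspace eigenvectors
    n = np.arange(M)
    A = np.exp(2j * np.pi * d_over_lambda * np.outer(n, np.sin(grid))) / np.sqrt(M)
    energy = np.sum(np.abs(E_noise.conj().T @ A) ** 2, axis=0)
    spectrum = 1.0 / np.maximum(energy, 1e-12)   # floor avoids division by zero
    inner = spectrum[1:-1]
    peaks = np.where((inner > spectrum[:-2]) & (inner >= spectrum[2:]))[0] + 1
    best = peaks[np.argsort(spectrum[peaks])[::-1][:m]]
    return np.sort(grid[best]), spectrum

# Hypothetical usage on a grid that is uniform in sin(theta).
grid = np.arcsin(np.linspace(-0.99, 0.99, 1981))
```

In the undersampled case the sample covariance has only T nonzero eigenvalues, so the "noise subspace" returned by the eigendecomposition includes its null space; the sketch makes no attempt to treat that case specially.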


A covariance matrix model $R_{mod}$ can then be constructed using the known order
and the MUSIC DOA estimates:

$$R_{music} = \sum_{j=1}^{m}\hat{\sigma}_j^2 S(\hat{\theta}_j)S^H(\hat{\theta}_j) + \hat{\sigma}_0^2 I_M \tag{7.60}$$

where the noise power is $\hat{\sigma}_0^2$; $\hat{\sigma}_j^2$ is the source power estimated via the
methodology in Ottersten et al. (1993) (as done here) or alternatively (in the case of
closely spaced sources) by the method given in Mestre et al. (2007); and $S(\hat{\theta})$ is
the MUSIC DOA estimate $\hat{\theta}$-dependent M-variate steering (antenna manifold)
vector.

In the Gaussian case, MLE is (theoretically) obtained by the selection of the
single largest maximum of the multivariate likelihood function (LF) (Anderson,
1958):

$$\mathcal{L}[R_{mod}(\Omega)] = \frac{\exp\left[-\mathrm{Tr}\,R_{mod}^{-1}(\Omega)\hat{R}_M\right]}{\pi^M \det R_{mod}(\Omega)} \tag{7.61}$$


where $\Omega$ represents the parameters, power ($\sigma_j^2$) and angle of arrival ($\theta_m$), for the m
sources. For a single source, MLE may be implemented via a one-dimensional
search (similar to MUSIC) over possible directions of arrival (i.e., the
traditional matched filter (beamformer) output), making comparison with the MUSIC
threshold condition straightforward. For multiple sources, however, analysis of
MLE threshold conditions is not as simple, primarily because the globally optimal
ML solution often cannot be guaranteed in practice. Therefore, an MLE proxy
algorithm is implemented utilizing the expected likelihood approach. To do so
on a practical basis in the low sample support regime (and in particular the
undersampled case), rather than the LF given in Equation (7.61), the LRs introduced
in Equations (7.26) and (7.51) (and repeated here) with scenario-invariant
p.d.f. properties are used.

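$$LR_u(X_T|R_{mod}) = \left[\frac{\det\left(X_T^H R_{mod}^{-1} X_T/T\right)\exp T}{\exp\mathrm{Tr}\left(X_T^H R_{mod}^{-1} X_T/T\right)}\right]^{1/T}, \qquad T < M \tag{7.62}$$

$$LR_u^{(p)}(X_T|R_{mod}) = \left[\frac{\det\left[\left(R_{mod}^{-1/2}\hat{R}_M R_{mod}^{-1/2}\right)^{(p)}\right]\exp M}{\exp\{\mathrm{Tr}\,\hat{R}_M R_{mod}^{-1}\}}\right]^{1/M} \le 1 \tag{7.63}$$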
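As a hedged illustration of how the matrix-extension likelihood ratio might be evaluated for a candidate model, the sketch below whitens the sample covariance with the Hermitian inverse square root of the model and obtains the determinant of the band extension from the central blocks (Equation 7.49). It assumes the $1/M$ exponent on the bracket and a sample covariance normalized by T; the function name `lr_u_band` and the example sizes are hypothetical.

```python
import numpy as np

def lr_u_band(X, R_mod, p):
    """Band-extension likelihood ratio of model R_mod for snapshots X (M x T),
    under the assumed form with a 1/M exponent; requires p <= T - 1."""
    M, T = X.shape
    w, U = np.linalg.eigh(R_mod)
    R_inv_half = (U / np.sqrt(w)) @ U.conj().T   # Hermitian R_mod^{-1/2}
    Y = R_inv_half @ X
    C = Y @ Y.conj().T / T                       # whitened sample matrix
    log_det = 0.0
    for q in range(M):
        last = min(M - 1, q + p)
        e = np.zeros(last - q + 1)
        e[0] = 1.0
        y = np.linalg.solve(C[q:last + 1, q:last + 1], e)
        log_det -= np.log(y[0].real)             # log-det of the band extension
    return np.exp((log_det + M - np.trace(C).real) / M)

# Hypothetical example with white data and the matching (identity) model.
rng = np.random.default_rng(3)
M, T, p = 10, 6, 4
X = (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))) / np.sqrt(2)
print(lr_u_band(X, np.eye(M), p))
```

The Hermitian square root is used deliberately: the band extension is not invariant to a unitary change of basis, so a Cholesky factor would in general give a different statistic.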


After deriving a threshold by the method given in Equation (7.52), the
expected likelihood “predict and cure” procedure can be detailed as a multistep
procedure as follows:


Step 1. Find those trials where the DOA estimates (produced by MUSIC) and
associated source-power estimates generate a covariance matrix model $R_{music}$
(7.60) that does not exceed the $LR_u$ or $LR_u^{(p)}$ threshold in Equation (7.52).
Otherwise, retain the trial results as the final estimates.


Step 2. Identify the particular DOA estimate within the set of m MUSIC-produced
DOA estimates that is an outlier. This step relies on the fact that,
statistically at least, an outlier estimate will not contribute as significantly to
the $LR_u$ or $LR_u^{(p)}$ value as a nonoutlier (as long as the scenario being dealt
with is identifiable; see footnote 2). The assumption here is that, since the scenario is far from
the ML breakdown condition, only close proximity to the true model $R_0$ may
produce correspondingly high $LR_u(X_T|R_{music})$ or $LR_u^{(p)}(X_T|R_{music})$ values.
Therefore, the source in the model $R_{music}(m)$ that can be deleted to create a
model $R_{music}(m-1)$ with the highest residual $LR_u$ or $LR_u^{(p)}$ is treated as an
outlier. This step requires only relatively minor computations to construct m
separate covariance matrices, based on the remaining $(m-1)$ DOA estimates.

Step 3. Provide outlier “replacement.” Here, instead of the outlier identified
at step 2, the source with a DOA estimate that maximally contributes to the
$LR_u(X_T|R_{mod})$ or $LR_u^{(p)}(X_T|R_{mod})$ is sought. A one-dimensional search (similar
to MUSIC) across possible DOAs is suggested to find this maximum. While
the increase in computation relative to classical MUSIC is large, the process of
calculating the $LR_u$ or $LR_u^{(p)}$ values along a single variable is still much more
tractable computationally compared to a $(2m+1)$-variate search, as demanded
by an ML global search routine.


2. Identifiable scenarios are ones in which a particular source wavefront can be unambiguously and
uniquely associated with a single DOA. See Stuart and Ord (1991) for a definition of nonidentifiability
and Abramovich et al. (1999) for a discussion of identifiability across the combination of scenario
and array.

Step 4. This step, in most cases, concludes the routine and considers “final
refinement” of the estimate set. Here any gradient-type local optimization routine
(Gauss-Newton, for example) can be used to deliver a solution. The
specific method used in this case is the Nelder-Mead simplex method (Lagarias
et al., 1998) provided via the MATLAB routine fminsearch. This optimization
can be conducted both on the rectified trials from step 3 and on trials that
exceeded the LR threshold imposed in step 1 (if improvement in nonthreshold
region accuracy is desired). This final step implements in full the expected
likelihood estimation approach (Equation 7.55) and asymptotically ($T \to \infty$) allows
us to treat these estimates as the ML estimates. For finite-sample support, such
a solution can only be defined as appropriate (in terms of Equation 7.55) rather
than globally optimal. Practically, though, this procedure may be terminated even
after step 3, if the $LR_u$ or $LR_u^{(p)}$ threshold is exceeded.


Step 5. If the original set includes more than a single outlier and, as a result,
the threshold is not exceeded after step 4, the procedure can be repeated until
the threshold is exceeded. One of the obvious drawbacks of this step is that with
probability $P_{FA}$ (at most), an already truly ML solution may be involved, and
therefore this procedure can never reach the threshold and must be terminated at
some stage. Because our interest is in robust comparison of MLE and MUSIC in
the threshold region rather than in an algorithm that must deliver an estimate at
every trial, in cases where the threshold is not reached on the first iteration, we
treat the result as a failure to find a global maximum and discard the result rather
than iterate the procedure until the threshold is exceeded.
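The outlier identification of step 2 can be sketched as a leave-one-out comparison: each source is deleted in turn, and the deletion that leaves the highest residual likelihood ratio marks the outlier. The code below is only an illustration under stated assumptions: it uses the subspace-projection $LR_u$ with an assumed $1/T$ exponent, parameterizes directions by $u = \sin\theta$ for a half-wavelength ULA, and takes source powers as given; all function names are hypothetical.

```python
import numpy as np

def steering(u, M):
    """Unit-norm half-wavelength ULA steering vector for u = sin(theta)."""
    return np.exp(1j * np.pi * np.arange(M) * u) / np.sqrt(M)

def model_cov(us, powers, M, noise_power=1.0):
    """Covariance model: sum of rank-one source terms plus white noise."""
    R = noise_power * np.eye(M, dtype=complex)
    for u, pw in zip(us, powers):
        s = steering(u, M)
        R += pw * np.outer(s, s.conj())
    return R

def lr_u(X, R):
    """Undersampled likelihood ratio (assumed form with a 1/T exponent)."""
    T = X.shape[1]
    C = X.conj().T @ np.linalg.solve(R, X) / T
    _, logdet = np.linalg.slogdet(C)
    return np.exp((logdet + T - np.trace(C).real) / T)

def find_outlier(X, us, powers, noise_power=1.0):
    """Index of the source whose deletion leaves the highest residual LR_u."""
    M = X.shape[0]
    us, powers = np.asarray(us, float), np.asarray(powers, float)
    scores = []
    for k in range(len(us)):
        keep = np.arange(len(us)) != k           # drop source k
        R = model_cov(us[keep], powers[keep], M, noise_power)
        scores.append(lr_u(X, R))
    return int(np.argmax(scores)), scores
```

Step 3 would then scan a one-dimensional grid of candidate directions for the replacement source, evaluating the same likelihood ratio at each grid point, which is why the cost stays far below that of a joint search over all parameters.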


First, the performance of the expected likelihood approach in “predicting”
MUSIC performance breakdown in step 1 is examined. The input source SNR
(20 dB) for Equation (7.57) has been chosen to result in approximately 50% of
the MUSIC estimates having an outlier (specifically 42.6% in $10^6$ Monte Carlo
trials). The distribution of the MUSIC-generated DOA errors (for the two closely
spaced sources) illustrated in Figure 7.1 clearly demonstrates the existence of
outliers with errors that exceed 20°, while virtually all the remaining “proper”
MUSIC trials exhibit DOA errors under 2°. So in the tabled results that follow
(Table 7.1), outliers are defined as those estimates with clairvoyantly determined
angular errors that exceed 2°. The percentage of trials that do not contain outliers
(“good” trials) or contain at least one outlier (“bad” trials) are shown in the Truth
column, along with the mean $LR_u$ or $LR_u^{(p)}$ for each good and bad population.
Although a clairvoyant analysis is employed to verify the results, the outlier
detection and cure procedure that generates the results in the final columns does
not rely on any clairvoyant knowledge.

In Figure 7.2, with the M = 10 scenario (7.56), the p.d.f. of the maximum
(over the three DOA estimates) errors for the 5742 good and 4258 bad trials is
introduced. These distributions demonstrate that in Monte Carlo simulations with
this scenario, problems in reliably identifying trials that contain an outlier are not
experienced. In Figures 7.2(a) and 7.2(b), for the identified subsets of good and
bad trials, sample p.d.f.s for the derived $LR_u(X_T|R_{music})$ and $LR_u^{(p)}(X_T|R_{music})$
are illustrated.

[Figure 7.1: “Angle Distribution of Good/Bad Scenarios”; horizontal axis: DOA offset of observed versus true DOA (degrees).]

FIGURE 7.1 Histogram of angle errors for sources ($\theta_2 = 0$, $\theta_3 = \arcsin(0.06)$) with input SNR =
20 dB. Source: © 2008 IEEE, with permission.


TABLE 7.1 Outlier Detection Statistics with Practical Threshold

Step                   Outlier    Truth    Mean      P_FA = 10^-3   Mean        P_FA = 10^-3
                       Detected            LR_u      α = 0.363      LR_u^(p)    α = 0.065

Breakdown Prediction   No         57.4%    0.6288    57.2%          0.2451      57.4%
                       Yes        42.6%    0.0008    42.8%          0.0015      42.6%


[Figure 7.2: panels (a) $LR_u(X_T|R_{music})$ and (b) $LR_u^{(p)}(X_T|R_{music})$, both for T = 6, M = 10.]

FIGURE 7.2 Likelihood ratio histograms with and without outliers: (a) $LR_u(X_T|R_{mod})$; (b)
$LR_u^{(p)}(X_T|R_{mod})$. Source: © 2008 IEEE, with permission.

While the exact form of each of the two types of undersampled likelihood
ratios differs, they both are very sensitive to the covariance matrix model mismatch
caused by DOA estimation outliers. As a result, the likelihood values associated
with the good and bad trials are well separated, making identification of bad
trials containing outliers quite reliable by all considered tests. Efficiency of such
outlier prediction in the absence of clairvoyant knowledge is illustrated below
for the tests $LR_u(X_T|R_{music})$ (7.62) and $LR_u^{(p)}(X_T|R_{music})$ (7.63).

For the expected likelihood methodology to be used, an LR threshold must
first be derived based on the p.d.f. of the LR for the true model. For expediency,
rather than compute the p.d.f. analytically, the scenario-invariant p.d.f.s of the tests
$LR_u(X_T|R_0)$ and $LR_u^{(p)}(X_T|R_0)$ were generated using $10^6$ Monte Carlo runs for
the a priori known values M = 10, T = 6, m = 3 associated with the scenario in
Equation (7.56). Examination of the resultant distribution indicated that for a
statistical approach using a false alarm rate of $P_{FA} = 10^{-3}$, the threshold $h^*$ in
Equation (7.52) needs to be set to 0.363 and 0.065 for the two likelihood ratios,
respectively. Table 7.1 contains the results of the outlier prediction and shows
that of the 57.4% of the $10^6$ trials that were good, the expected likelihood test
was able to identify 57.2% in the case of $LR_u$ and 57.4% in the case of $LR_u^{(p)}$,
meaning that at most 0.2% of the trials were erroneously identified as bad, a value
consistent with the considered false alarm rate of $10^{-3}$.

More important, all 42.6% of the trials clairvoyantly identified as bad using a 
2° DOA error threshold were also properly identified by the statistical nonclair¬ 
voyant thresholds. Thus, the MUSIC-specific threshold phenomenon, even at a 
source separation resulting in a significant failure rate (> 40%), is observed to be 
sufficiently removed from the ultimate ML breakdown conditions to be reliably 
identified (predicted) by expected likelihood in the considered undersampled 
(T <M) scenarios. 

Let us now examine the ability of the expected likelihood methodology to
support outlier “cure” in steps 2 through 4. The “predict and cure” routine does
provide (as illustrated next) significant performance improvement in considered
scenarios. However, special circumstances may be found where this approach
fails, as with any ML “surrogate,” in scenarios where an accurate global ML
maximum in the region of the true model still exists. To account for this, the
algorithm has been designed to discard solutions in those circumstances (as
detailed in step 5), so that a valid comparison between MUSIC and MLE in
the threshold region can be made. This emphasizes the fact that the algorithm
is being used to illustrate the value of solution performance assessment using
expected likelihood, rather than being optimized to perform as an operational
algorithm. While the algorithm has clear advantages in such an operational
regime (including MUSIC-type computational costs in “easy” scenarios and
a mechanism to decouple the multidimensional search for MLE to a series of
one-dimensional searches), further necessary analysis of computational costs and
appropriate $P_{FA}$ selection criteria to make the algorithm operationally relevant
is not included here.

The efficiency of this rectification technique can be demonstrated using the
scenario given in Equation (7.56). The results of the routine are presented in
Table 7.2. It shows that in 99.2% to 99.3% of all trials, the algorithm succeeded
in generating a scenario with an LR above the threshold $h^* = 0.363$ or 0.065
calculated for the two different undersampled likelihood ratios with a $P_{FA}$ fixed at
$10^{-3}$. At the same time, 99.2% of all trials resulting from the rectification routine
did not contain an outlier, versus the original 57.4% produced by MUSIC.
Therefore, the introduced technique produced solutions above the imposed threshold
but still containing an outlier in 0.1% to 0.2% of the cases.

In fact, the 0.1% to 0.2% of misidentified trials illustrates instances of the
statistical “ML breakdown” phenomenon where erroneous solutions (derived via
the surrogate ML rectification routine) still generated a sufficiently high
(above-the-threshold) LR. According to Table 7.2, in 0.7% to 0.8% of all trials, neither
LR could be driven above its respective threshold, but unlike the prediction step,
all of these nonrectified trials remaining below the threshold still contained an
outlier.

Thus, in 99.2% of all trials, solutions without an outlier were obtained. In
0.7% to 0.8% of trials the rectification procedure failed (for the reasons described
in step 5 of the procedure), but all of those trials still containing an outlier after
the rectification process were properly identified as bad. Only in 0.1% to 0.2%
of trials did the outlier remain undetected and therefore unrectified. Obviously, 0.1% to
0.2% versus 42.6% of “broken-down” solutions means considerable performance
improvement.

One can note some differences between the two undersampled likelihood
ratios given in Equations (7.62) and (7.63), both in computation and in
prediction/cure performance. In this scenario, the differences are essentially
immaterial, but some qualitative differences were observed while using the
likelihood ratios in other scenarios. For example, it was observed that the ability of
the subspace projection-likelihood ratio given in Equation (7.62) to discriminate
between scenarios with and without outliers exceeded the performance of
the Dym-Gohberg matrix extension-likelihood ratio given in Equation (7.63) at
extremely low $P_{FA}$ values. Also, the matrix extension-likelihood ratio appeared
to be less sensitive to errors in estimation of the noise floor (a problem that occurs
when the number of snapshots T is only slightly larger than the rank of the signal
subspace $R_s$). But in most cases, performance was similar.


TABLE 7.2 Outlier Predict and Cure Statistics Using $LR_u(X_T|R_{music})$
and Practical Threshold

Step                          Outlier    Truth    Mean      P_FA = 10^-3   Mean        P_FA = 10^-3
                              Detected            LR_u      α = 0.363      LR_u^(p)    α = 0.065

1. Breakdown Prediction       No         57.4%    0.6288    57.2%          0.2451      57.4%
                              Yes        42.6%    0.0008    42.8%          0.0015      42.6%
2/3. Outlier Prediction/Cure  No         99.2%    0.6251    99.2%          0.2295      99.0%
                              Yes        0.8%     0.5111    0.8%           0.1175      1.0%
4. Final Refinement           No         99.2%    0.6255    99.3%          0.2297      99.2%
                              Yes        0.8%     0.5111    0.7%           0.1175      0.8%

Graphically, the performance of the expected likelihood approach can be
further examined across a broader range of input SNRs. To do so, a slightly different
(but still undersampled) scenario used by a range of investigators (Abramovich
et al., 2007a; Mestre, 2006a; Steyer and Vaccaro, 2007) is employed. The
scenario uses an M = 20-element ULA, T = 15 samples, array element spacing
of $d/\lambda = 0.5$, and m = 4 independent equal-power Gaussian sources (stochastic
source model) located at azimuth angles

$$\theta_m = \{-20°, -10°, 35°, 37°\} \tag{7.64}$$

immersed in white noise, with various per-element source SNRs (ranging from
−15 to +25 dB, or set to specific SNRs for more detailed investigation).

Figure 7.3(a) shows the mean-square error (MSE), averaged over 300 trials
for each 1-dB SNR increment, for DOA estimates of the two closely spaced
sources (at 35° and 37°). It demonstrates the familiar threshold effect in MSE
for the DOA estimation process, with the sudden degradation in DOA accuracy
(due to outliers) as the SNR is decreased. The threshold point predicted for
MUSIC of 20 dB is consistent with similar results shown in Mestre (2006a) and
Steyer and Vaccaro (2007). The MLE breakdown is demonstrated with the MLE
proxy algorithm discussed earlier, using the undersampled likelihood ratio given
in Equation (7.62) and “seeding” solutions produced by MUSIC (and therefore
labeled here as “MLR-PAC” for MUSIC-seeded likelihood ratio “Predict and
Cure”). Also shown is the stochastic Cramer-Rao bound (CRB) for the two
sources at 35° and 37° (averaged together).

[Figure 7.3: two panels, each titled “M = 20, T = 15, 4 Sources (−20°, −10°, 35°, 37°)”; vertical axis of panel (a): mean-square error (degree²).]

FIGURE 7.3 Multiple-source estimation on a 20-element ULA with T = 15 samples for MUSIC,
G-MUSIC, and MLE. (a) Mean-square error; (b) outlier production rate. The SNR breakpoint
(threshold) decreases from around 20 dB for MUSIC to 17 dB for G-MUSIC, but is still dramatically greater than
the MLE proxy (LR-PAC) threshold observed at around 0 dB. Source: © 2008 IEEE, with permission.

Figure 7.3(a) also shows the threshold region performance of an alternative
DOA estimation algorithm referred to as G-MUSIC. G-MUSIC was derived in
Mestre (2006a,b), seeking an improvement in MUSIC's threshold performance in
undersampled scenarios by deriving a DOA estimator that is consistent in the
random matrix theory (RMT) asymptotic conditions, where both the array
dimension $M$ and the number of snapshots $T$ grow without bound but at the
same rate:

$$M, T \to \infty, \qquad M/T \to \gamma < \infty \tag{7.65}$$


One can observe from Figure 7.3(a) two important items for the G-MUSIC
algorithm. First, the MLE proxy algorithm provides essentially the same
performance independent of the seeding mechanism (i.e., the MLR-PAC and GLR-PAC
curves use MUSIC and G-MUSIC DOA estimates as seeds, but still provide the
same final MSE profile versus SNR), indicating reliable association of the results
with true MLE performance in the threshold area, rather than performance
specific to the method used to generate the initial DOA estimates. Second, while
there is a performance difference in the threshold region between MUSIC and
G-MUSIC (consistent with similar figures in Mestre, 2006a) indicating some
improvement in threshold conditions for G-MUSIC with respect to conventional
MUSIC, Mestre and Lagunas (2008) noted, "It was rather disappointing
to observe that the use of M, T-consistent estimates does not cure the breakdown
effect of subspace-based techniques (in MUSIC) and it merely moves to a
lower SNR." This indicates that G-MUSIC is not able to avoid the fundamental
phenomenon that separates MUSIC from MLE breakdown conditions, which is
surprising given that the G-MUSIC derivations and actual Monte Carlo
simulations were conducted under conditions that "guarantee separation of the noise
and first signal eigenvalue cluster of the asymptotic eigenvalue distribution of
$\hat{R}$" (Mestre, 2006b). Therefore, asymptotically the "subspace swap"
phenomenon to which the threshold effect in MUSIC estimation is attributed
(Hawkes et al., 2001; Thomas et al., 1995; Tufts et al., 1991) should have been
precluded by these conditions, and yet G-MUSIC's breakdown was regularly
observed in the conducted Monte Carlo trials.

Clearly the connections between "subspace swap" in MUSIC, G-MUSIC,
and MLE performance breakdown, as well as the relevance of the GSA/RMT
methodology for practically limited M and T values, need to be clarified. These
connections are explored in the next section.

7.5 SUBSPACE SWAP AND MUSIC PERFORMANCE BREAKDOWN

The subspace swap phenomenon has often been treated as the sole apparent
mechanism responsible for performance breakdown in subspace-based
techniques. This phenomenon is specified (Thomas et al., 1995) as a case where
the estimates of the noise subspace eigenvalues
$\lambda_{m+1} = \lambda_{m+2} = \cdots = \lambda_M = \sigma_0^2$
with increasing probability become larger than the estimates of the signal
subspace eigenvalues $\lambda_m \le \lambda_{m-1} \le \dots \le \lambda_1$. "More precisely,
in such a case one or more pairs in the set $\{\hat{\lambda}_j, \hat{e}_j\}$, $j = 1, \dots, m$
actually estimate noise (subspace) eigenelements instead of signal elements"
(Hawkes et al., 2001).
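One simple way to look for a swap in a simulation is to measure how much of the true signal subspace is captured by the $m$ principal sample eigenvectors. The sketch below is a hypothetical diagnostic introduced here for illustration (the function name and the overlap measure are assumptions, not a procedure taken from the cited references): it returns $\mathrm{tr}(E_S^H \hat{E}_S \hat{E}_S^H E_S)$, which equals $m$ when the two subspaces coincide and drops by roughly one for each sample eigenvector that has moved into the noise subspace.

```python
import numpy as np

def signal_subspace_overlap(R_true, R_hat, m):
    """Return tr(E_S^H Ehat_S Ehat_S^H E_S), a number between 0 and m."""
    # eigh returns eigenvalues in ascending order, so the m principal
    # eigenvectors are the last m columns.
    _, E = np.linalg.eigh(R_true)
    _, E_hat = np.linalg.eigh(R_hat)
    E_S, E_hat_S = E[:, -m:], E_hat[:, -m:]
    return float(np.linalg.norm(E_S.conj().T @ E_hat_S, "fro") ** 2)

# Toy example: two signal eigenvalues (5 and 3) above a unit noise floor.
R = np.diag([5.0, 3.0, 1.0, 1.0, 1.0, 1.0])
# Extreme "swap": the two largest eigenvalues sit on directions that are
# pure noise directions of R.
R_swapped = np.diag([1.0, 1.0, 1.0, 1.0, 5.0, 3.0])

no_swap = signal_subspace_overlap(R, R, 2)          # full overlap
full_swap = signal_subspace_overlap(R, R_swapped, 2)  # no overlap
```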


The fact that subspace swap is associated with MUSIC breakdown has
been well demonstrated in the literature, with several (only partially) successful
attempts undertaken to analytically specify the threshold conditions for a given
scenario. In Hawkes et al. (2001), for example, it was admitted that analytical
predictions "grossly under-estimate probability of subspace swap in and below
the threshold region." This lack of complete success may be attributed to the
traditional ($M$ = constant, $T \to \infty$) asymptotic perturbation eigendecomposition
analysis adopted for these derivations. Actual breakdown is of course observed
in finite $M/T$ conditions, and the scenario in Equation (7.56) with $T < M$ is quite
far from classical asymptotic assumptions.

To explore the conditions under which subspace algorithms break down and
establish the potential relationship with subspace swaps, it is convenient to
introduce some basic notions of large random matrix analysis related to sample
covariance matrices. In particular, it is important to establish the behavior of the
sample eigenvalues and eigenvectors, namely $\{\hat{\lambda}_j, \hat{e}_j\}$, $j = 1, \dots, M$, as both
the sample size $T$ and the observation dimension $M$ increase without bound at
the same rate. This will be the objective of the following subsection.


7.5.1 Fundamentals of Large-Sample Covariance Matrices 

The theory of large random matrices studies the behavior of the eigenvalues
and eigenvectors as the dimensions of the matrices tend to infinity. It turns out
that, for several random matrix models, the histogram of the eigenvalues
converges to a fixed density function as the dimensions of the matrix grow without
bound. To further understand this property, consider once more the $M \times M$ sample
covariance matrix constructed from $T$ snapshots:

$$\hat{R}_M = \frac{1}{T} \sum_{t=1}^{T} x(t)\, x^H(t)$$

where $x(t)$ denotes the $t$-th $M \times 1$ snapshot.


Figure 7.4 represents (solid line) the histogram of the eigenvalues of a
particular realization of $\hat{R}_M$ for different values of $M$ in a situation where the
number of samples per antenna was fixed to be equal to 5 (so that $M/T = 0.2$). In this
particular example, the true covariance matrix $R_0$ presented two eigenvalues
{1, 5} with equal multiplicity for all $M$. This means that the distribution of
eigenvalues of the true covariance matrix consisted of two Dirac delta functions with
equal mass located at 1 and 5, and this distribution did not vary with $M$. We can
observe that, as $M, T$ grow large with a constant ratio, the empirical distribution
of the eigenvalues of any realization of $\hat{R}_M$ converges to a nonrandom density
(dotted line). It is also quite visible that each of the two true eigenvalues {1, 5}
shows up in the distribution of eigenvalues of $\hat{R}_M$ as a separate eigenvalue
cluster. For instance, in Figure 7.4 we can clearly distinguish two sample eigenvalue
clusters, each one being located around one of the two true eigenvalues {1, 5}.

FIGURE 7.4 Histogram of the eigenvalues of a particular realization of $\hat{R}_M$ for different values
of M when five samples per antenna are available (M/T = 0.2). In this example, the true covariance
matrix $R_0$ always presents two eigenvalues {1, 5} with equal multiplicity. Panels shown:
M = 30, T = 150 and M = 50, T = 250.
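The clustering described above is easy to reproduce. The following is a minimal sketch under assumptions made here: Gaussian snapshots, a diagonal true covariance, the larger size M = 100, T = 500 (same ratio M/T = 0.2), and a cut at 2.0 chosen by eye between the two clusters rather than derived from the theory.

```python
import numpy as np

def sample_eigenvalues(true_eigs, T, rng):
    """Eigenvalues of a sample covariance built from T complex Gaussian snapshots."""
    true_eigs = np.asarray(true_eigs, dtype=float)
    M = true_eigs.size
    # Complex entries with zero mean and variance 1/2 per real/imaginary part.
    U = (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))) / np.sqrt(2)
    X = np.sqrt(true_eigs)[:, None] * U     # snapshots with covariance diag(true_eigs)
    R_hat = X @ X.conj().T / T
    return np.linalg.eigvalsh(R_hat)

rng = np.random.default_rng(0)
M, T = 100, 500                              # five samples per antenna
true_eigs = np.repeat([1.0, 5.0], M // 2)    # eigenvalues {1, 5}, equal multiplicity
lam_hat = sample_eigenvalues(true_eigs, T, rng)

n_low = int(np.sum(lam_hat < 2.0))           # sample eigenvalues in the lower cluster
hist, edges = np.histogram(lam_hat, bins=40, density=True)  # histogram as in Figure 7.4
```

In this setting `n_low` matches the multiplicity of the true eigenvalue 1, which is the separation behavior discussed later in this subsection.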

Before going into a more detailed characterization of this behavior, it seems
quite reasonable to ask whether something can be said in a situation where
the eigenvalue distribution of $R_0$ is not kept constant for all $M$, and thus is
allowed to vary with the matrix dimension. Figure 7.5 represents once again the
evolution of a particular realization of the histogram of the eigenvalues of $\hat{R}_M$ as
the matrix dimension $M$ and the number of samples $T$ increase without bound
with a constant ratio, as in the scenario considered in Figure 7.4. In this situation,
however, the multiplicity of the highest true eigenvalue (equal to 5) was set to
15 regardless of the matrix dimension, whereas the multiplicity of the smallest
one (equal to 1) was fixed at $M - 15$, and therefore increased with the matrix
dimension. This implies that the contribution of the highest eigenvalue to the
eigenvalue distribution of $R_M$ becomes negligible as the matrix dimension $M$
grows to infinity, and therefore the density of eigenvalues of $R_M$ converges to a
single Dirac delta function centered at 1.

FIGURE 7.5 Histogram of the eigenvalues of a particular realization of $\hat{R}_M$ for different values
of M when five samples per antenna are available (M/T = 0.2). In this example, the true covariance
matrix $R_M$ presents two eigenvalues {1, 5}. Eigenvalue 5 has fixed multiplicity equal to 15 for all
M, whereas eigenvalue 1 presents a multiplicity equal to M − 15. Panels shown: M = 30, T = 150
and M = 50, T = 250.

Observe that this phenomenon is reflected in the evolution of the density of
eigenvalues of the sample covariance matrix $\hat{R}_M$ (solid line), in the sense that the
sample eigenvalues linked to the highest true eigenvalue contribute with a
diminishing cluster in the asymptotic eigenvalue distribution in comparison with the
cluster of sample eigenvalues associated with the smallest true eigenvalue. On the
other hand, it is worth pointing out that, even in this situation, we can approximate
the nonasymptotic sample eigenvalue distribution by an equivalent deterministic
counterpart (dotted line), which describes it quite closely. In fact, it will be shown
that the behavior of the eigenvalues of $R_0$ as $M \to \infty$ is not important as long as
they remain bounded for all $M$. Thus, we will be able to provide a nonasymptotic
description of the sample eigenvalue distribution, which becomes a good
approximation as $M, T \to \infty$ regardless of the behavior of $R_0$. Let us now provide
a more accurate description of this behavior in analytical terms.

The study of the asymptotic behavior of the eigenvalues and eigenvectors of
$\hat{R}_M$ is usually carried out by examining the so-called resolvent of $\hat{R}_M$, which is
a matrix-valued complex function defined as

$$Q_M(z) = \left(\hat{R}_M - z I_M\right)^{-1}$$

where $z$ can take any value outside the real axis, namely $z \in \mathbb{C} \setminus \mathbb{R}$. The resolvent
is a very important function, because many interesting quantities related to the
asymptotic behavior of $\hat{R}_M$ can be retrieved from it. For example, one may
consider the normalized trace of the resolvent:

$$\hat{m}_M(z) = \frac{1}{M} \mathrm{tr}\left[Q_M(z)\right] = \frac{1}{M} \mathrm{tr}\left[\left(\hat{R}_M - z I_M\right)^{-1}\right]$$

which is usually referred to as the Stieltjes transform of the sample eigenvalues.
This quantity is very strongly related to the eigenvalues of $\hat{R}_M$, denoted
$\hat{\lambda}_1 \ge \hat{\lambda}_2 \ge \dots \ge \hat{\lambda}_M$, in the sense that we can express

$$\hat{m}_M(z) = \frac{1}{M} \sum_{k=1}^{M} \frac{1}{\hat{\lambda}_k - z} \tag{7.66}$$
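The equivalence between the normalized trace of the resolvent and Equation (7.66) can be checked numerically. The sketch below uses a small random sample covariance and an arbitrary test point $z$; the function names and the test values are illustrative assumptions.

```python
import numpy as np

def stieltjes_from_resolvent(R_hat, z):
    """(1/M) tr[(R_hat - z I)^{-1}], computed from the resolvent itself."""
    M = R_hat.shape[0]
    Q = np.linalg.inv(R_hat - z * np.eye(M))
    return np.trace(Q) / M

def stieltjes_from_eigenvalues(R_hat, z):
    """Right-hand side of Equation (7.66)."""
    lam_hat = np.linalg.eigvalsh(R_hat)
    return np.mean(1.0 / (lam_hat - z))

rng = np.random.default_rng(1)
M, T = 8, 40
X = (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))) / np.sqrt(2)
R_hat = X @ X.conj().T / T
z = 1.0 + 0.5j                        # any point off the real axis

m_res = stieltjes_from_resolvent(R_hat, z)
m_eig = stieltjes_from_eigenvalues(R_hat, z)
```

Both routes agree to numerical precision, and the imaginary part is positive whenever $\mathrm{Im}(z) > 0$.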


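and therefore we see that $\hat{m}_M(z)$ is completely determined by the eigenvalues of
$\hat{R}_M$. But the true importance of $\hat{m}_M(z)$ stems from its close relationship with the
empirical distribution of the eigenvalues of $\hat{R}_M$, denoted $F_{\hat{R}_M}(\lambda)$ and defined as

$$F_{\hat{R}_M}(\lambda) = \frac{1}{M}\, \#\left\{k = 1 \dots M : \hat{\lambda}_k \le \lambda\right\}$$

where $\#\{\cdot\}$ denotes the cardinality of a set. The function $F_{\hat{R}_M}(\lambda)$ returns, for
each $\lambda$, the percentage of sample eigenvalues that is lower than or equal to $\lambda$.
This function can be expressed in terms of $\hat{m}_M(z)$ using the Stieltjes inversion
formula:

$$F_{\hat{R}_M}(\lambda) = \lim_{y \to 0^+} \frac{1}{\pi} \int_{-\infty}^{\lambda} \mathrm{Im}\left[\hat{m}_M(x + iy)\right] dx \tag{7.67}$$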
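For the finite sum in Equation (7.66) the integral in Equation (7.67) can be evaluated in closed form, which makes the inversion easy to illustrate. Since $\mathrm{Im}[1/(\hat{\lambda}_k - x - iy)] = y / ((\hat{\lambda}_k - x)^2 + y^2)$, integrating over $x$ from $-\infty$ to $\lambda$ gives $\arctan((\lambda - \hat{\lambda}_k)/y) + \pi/2$ for each eigenvalue. The sketch below (function names and test values are assumptions) compares this smoothed expression, for a small $y$, with the direct eigenvalue count.

```python
import numpy as np

def empirical_cdf(lam_hat, lam):
    """Fraction of sample eigenvalues lower than or equal to lam."""
    return float(np.mean(np.asarray(lam_hat, dtype=float) <= lam))

def cdf_from_stieltjes(lam_hat, lam, y):
    """(1/pi) * integral_{-inf}^{lam} Im[m_hat(x + iy)] dx, in closed form."""
    lam_hat = np.asarray(lam_hat, dtype=float)
    return float(np.mean(0.5 + np.arctan((lam - lam_hat) / y) / np.pi))

lam_hat = [1.0, 2.0, 3.0]
F_direct = empirical_cdf(lam_hat, 2.5)                 # two of three eigenvalues
F_inverted = cdf_from_stieltjes(lam_hat, 2.5, y=1e-6)  # approaches F_direct as y -> 0+
```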


Thanks to this identity, we can study the properties of the empirical distribution
of eigenvalues of $\hat{R}_M$ by studying the properties of $\hat{m}_M(z)$ or, equivalently, the
normalized trace of the resolvent $Q_M(z)$.

Another interesting quantity that is strongly related to the resolvent is the
MUSIC cost function:

$$\hat{\phi}(\hat{R}_M, \theta) = S^H \hat{E}_N \hat{E}_N^H S \tag{7.68}$$

where $\hat{E}_N$ contains the noise sample eigenvectors and $S = S(\theta)$ is the steering
vector evaluated at a particular angle of arrival. It turns out that, making use of
the Cauchy integral formula, one can express

$$\hat{\phi}(\hat{R}_M, \theta) = \frac{1}{2\pi i} \oint_{C^-} \hat{s}_M(z)\, dz$$

where

$$\hat{s}_M(z) = S^H Q_M(z) S \tag{7.69}$$

and where $C^-$ is a negatively oriented contour that encloses (with index 1) the
noise sample eigenvalues on the complex plane. To see this, observe that we can
express $\hat{\phi}(\hat{R}_M, \theta)$ as

$$\hat{\phi}(\hat{R}_M, \theta) = \frac{1}{2\pi i} \oint_{C^-} \sum_{k=1}^{M} \frac{S^H \hat{e}_k \hat{e}_k^H S}{\hat{\lambda}_k - z}\, dz \tag{7.70}$$

and the function inside the complex integral is analytic over all the complex plane
except for a set of simple poles at the eigenvalues $\{\hat{\lambda}_k, k = 1 \dots M\}$. Assuming
that all sample eigenvalues are different, and since the pole located at $z = \hat{\lambda}_k$
has a residue equal to $-S^H \hat{e}_k \hat{e}_k^H S$, we can obtain Equation (7.68) as a simple
application of the Cauchy integral formula (Marsden and Hoffman, 1987).

It is therefore clear that we can study the behavior of quantities associated with
the eigenvalues and eigenvectors of the sample covariance matrix by studying
the behavior of the resolvent $Q_M(z)$. In particular, the asymptotic behavior of the
normalized trace of $Q_M(z)$ will describe the asymptotic correspondence between
the true eigenvalues $\{\lambda_k\}$ and the sample eigenvalues $\{\hat{\lambda}_k\}$. On the other hand,
the asymptotic form of a generic quadratic form of the resolvent (i.e., $S^H Q_M(z) S$
for a generic column vector $S$) will be very useful in studying the behavior of
the MUSIC DOA estimation algorithm.
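The contour-integral form of the MUSIC cost function can be verified numerically on a small example. In the sketch below the matrix, the eigenvalues, and the circular contour are assumptions chosen for illustration: the circle of radius 0.8 around 1.0 encloses the four "noise" eigenvalues and leaves the two large ones outside, and it is traversed clockwise (negative orientation) using the trapezoidal rule.

```python
import numpy as np

def music_cost_direct(R_hat, S, n_noise):
    """S^H E_N E_N^H S using the n_noise smallest sample eigenvectors."""
    _, E_hat = np.linalg.eigh(R_hat)           # ascending eigenvalues
    E_N = E_hat[:, :n_noise]
    return float(np.linalg.norm(E_N.conj().T @ S) ** 2)

def music_cost_contour(R_hat, S, center, radius, n_points=400):
    """(1/(2 pi i)) times the clockwise contour integral of S^H (R_hat - z I)^{-1} S."""
    M = R_hat.shape[0]
    t = 2.0 * np.pi * np.arange(n_points) / n_points
    z = center + radius * np.exp(-1j * t)      # clockwise circle
    dz = -1j * radius * np.exp(-1j * t) * (2.0 * np.pi / n_points)
    s_hat = np.array([S.conj() @ np.linalg.solve(R_hat - zk * np.eye(M), S) for zk in z])
    return float((np.sum(s_hat * dz) / (2j * np.pi)).real)

rng = np.random.default_rng(2)
M = 6
V, _ = np.linalg.qr(rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M)))
lam = np.array([10.0, 8.0, 1.2, 1.0, 0.9, 0.8])   # two "signal" and four "noise" eigenvalues
R_hat = (V * lam) @ V.conj().T                     # Hermitian matrix with these eigenvalues
S = np.ones(M, dtype=complex) / np.sqrt(M)         # unit-norm test vector

direct = music_cost_direct(R_hat, S, n_noise=4)
contour = music_cost_contour(R_hat, S, center=1.0, radius=0.8)
```

The two values coincide up to numerical error, in agreement with the residue argument given above.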

Next, we provide a fundamental result that will be the basis for the
asymptotic characterization of these quantities. Before presenting the result, it is worth
pointing out that, since $Q_M(z)$ depends on the sample covariance matrix $\hat{R}_M$, it
is a random quantity and so therefore are both its normalized trace $\hat{m}_M(z)$ and its
quadratic forms $\hat{s}_M(z)$. Theorem 7.1 establishes the fact that, when we let both
the sample size $T$ and the matrix dimension $M$ increase without bound at the
same rate, these two random quantities behave as two equivalent deterministic
counterparts, which will be denoted as $\bar{m}_M(z)$ and $\bar{s}_M(z)$, respectively.


Theorem 7.1. Assume that $\|S\| = 1$ and that the true covariance matrix has
a uniformly bounded spectral norm for all $M$. Assume that we can express
$\hat{R}_M = R_M^{1/2} U U^H R_M^{1/2}$, where $R_M^{1/2}$ is the Hermitian positive definite square root of
$R_M$ and $U$ has independent and identically distributed complex entries with
independent real and imaginary parts having zero mean, variance 1/2, and bounded
moments of higher order. Then, as $M, T \to \infty$ and $M/T \to \gamma$, $0 < \gamma < \infty$, the
quantities $\hat{m}_M(z)$ and $\hat{s}_M(z)$ are almost surely close to two deterministic
counterparts, $\bar{m}_M(z)$ and $\bar{s}_M(z)$, respectively, for all
$z \in \mathbb{C}^+ = \{z \in \mathbb{C} : \mathrm{Im}(z) > 0\}$, in the sense that

$$\left|\hat{m}_M(z) - \bar{m}_M(z)\right| \to 0, \qquad \left|\hat{s}_M(z) - \bar{s}_M(z)\right| \to 0$$

where

$$\bar{s}_M(z) = \sum_{k=1}^{M} \frac{S^H e_k e_k^H S}{\lambda_k \left(1 - \gamma - \gamma z \bar{m}_M(z)\right) - z} \tag{7.71}$$

and where $\bar{m}_M(z)$ is the unique solution to the following equation:

$$\bar{m}_M(z) = \frac{1}{M} \sum_{k=1}^{M} \frac{1}{\lambda_k \left(1 - \gamma - \gamma z \bar{m}_M(z)\right) - z} \tag{7.72}$$

located on the set $\{\bar{m} \in \mathbb{C} : -(1 - \gamma)/z + \gamma \bar{m} \in \mathbb{C}^+\}$.

The proof of this theorem has appeared in different papers under different
assumptions. For instance, in Girko (1998) this theorem appears under slightly
weaker statistical assumptions, where the existence of only fourth-order moments
of the entries of $U$ is required. In Silverstein (1995) and Bai et al. (2007) it is
proven without the need of assuming that the entries of $U$ have moments of order
higher than 2. In Mestre (2006c) a simpler alternative proof is given under the
assumptions stated here.
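The deterministic equivalent in Equation (7.72) is straightforward to evaluate numerically. The sketch below uses a plain fixed-point iteration, which is an assumption of this example (it is not the method of the cited proofs, it converges for the test point used here, and it does not explicitly check the solution set stated in the theorem). The sample covariance is formed with an explicit $1/T$ normalization, which is also an assumption of the sketch.

```python
import numpy as np

def m_bar(true_eigs, gamma, z, n_iter=500):
    """Fixed-point iteration for the solution of Equation (7.72)."""
    lam = np.asarray(true_eigs, dtype=float)
    m = -1.0 / z                                   # starting point with Im(m) > 0
    for _ in range(n_iter):
        m = np.mean(1.0 / (lam * (1.0 - gamma - gamma * z * m) - z))
    return m

def m_hat(R_hat, z):
    """Stieltjes transform of the sample eigenvalues, Equation (7.66)."""
    return np.mean(1.0 / (np.linalg.eigvalsh(R_hat) - z))

rng = np.random.default_rng(3)
M, T = 100, 500
true_eigs = np.repeat([1.0, 5.0], M // 2)
U = (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))) / np.sqrt(2)
X = np.sqrt(true_eigs)[:, None] * U                # R^{1/2} U for a diagonal R
R_hat = X @ X.conj().T / T                         # 1/T normalization assumed here

z = 3.0 + 2.0j
m_det = m_bar(true_eigs, M / T, z)                 # deterministic equivalent
m_rand = m_hat(R_hat, z)                           # random quantity from one realization
residual = m_det - np.mean(1.0 / (true_eigs * (1.0 - M / T - (M / T) * z * m_det) - z))
```

For this realization the random and deterministic quantities differ only slightly, as the theorem suggests for large $M$ and $T$.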

An important observation concerning the theorem is the fact that no
asymptotic expression is given for either $\hat{m}_M(z)$ or $\hat{s}_M(z)$. Instead, these two random
quantities are found to be equivalent to two deterministic counterparts, $\bar{m}_M(z)$
and $\bar{s}_M(z)$, which depend on the matrix dimension $M$. Thus, the result is valid
regardless of how these quantities $\bar{m}_M(z)$ and $\bar{s}_M(z)$ behave as $M$ increases (in
fact, these quantities do not need to converge at all as $M \to \infty$). Thus, the
theorem establishes a nonasymptotic description that is a valid approximation in the
asymptotic regime when $M$ and $T$ increase without bound at the same rate. This
in turn implies that our characterization of the eigenvalues and eigenvectors of
$\hat{R}_M$ does not depend on how the eigenvalues and eigenvectors of $R_M$ behave as
$M$ increases to infinity.

Let us first investigate how the above result may be helpful in order to
characterize the behavior of the eigenvalues of $\hat{R}_M$. We recall first that in using the
inverse Stieltjes transform in Equation (7.67) one can easily recover the
empirical distribution of eigenvalues of $\hat{R}_M$ from the normalized trace of the
resolvent, $\hat{m}_M(z)$. On the other hand, since $\hat{m}_M(z)$ is asymptotically close to the
deterministic counterpart $\bar{m}_M(z)$, one can prove that the empirical distribution
of eigenvalues of $\hat{R}_M$ will be close to the equivalent distribution associated with
$\bar{m}_M(z)$ via the inverse Stieltjes transform:

$$G_M(\lambda) = \lim_{y \to 0^+} \frac{1}{\pi} \int_{-\infty}^{\lambda} \mathrm{Im}\left[\bar{m}_M(x + iy)\right] dx$$

Now one can easily see that the associated measure is absolutely continuous
(except at zero, in the undersampled case), and therefore we can take derivatives
in order to investigate the associated density, which can be expressed as

$$g_M(\lambda) = \lim_{y \to 0^+} \frac{1}{\pi} \mathrm{Im}\left[\bar{m}_M(\lambda + iy)\right]$$

This means that, by examining the form of the imaginary part of $\bar{m}_M(z)$ when
$z$ tends to the real axis, we are able to establish the form of the density $g_M(\lambda)$,
which is an asymptotically good approximation of the "random density" (i.e.,
histogram) of the sample eigenvalues. Once again, it is worth pointing out
that $g_M(\lambda)$ is a deterministic approximation of the random eigenvalue density
regardless of how the true eigenvalues $\{\lambda_m\}$ behave as $M \to \infty$.

In Figure 7.6 we represent the form of $g_M(\lambda)$ for different values of $\gamma$ in a
scenario where the true covariance matrix presented three different eigenvalues,
{15, 5, 1}, with equal multiplicities. It can readily be observed that, as $\gamma \to 0$
(meaning that the number of samples per antenna increases to infinity), the sample
eigenvalues tend to group around the true eigenvalues, forming clusters. If $\gamma$ is
sufficiently high, the density $g_M(\lambda)$ consists of a single cluster, whereas if $\gamma$
goes to zero, the density $g_M(\lambda)$ splits up into smaller clusters. If the number of
samples per antenna is sufficiently high (i.e., $\gamma$ is sufficiently low), each true
eigenvalue generates a single cluster in $g_M(\lambda)$.

FIGURE 7.6 Representation of the density $g_M(\lambda)$ for different values of $\gamma$ where the true
covariance matrix presents three equal-multiplicity eigenvalues {15, 5, 1}.

The number and position of the different clusters in $g_M(\lambda)$ can be determined
as follows. Let us consider the following equation in an unknown $f$:

$$\frac{1}{M} \sum_{k=1}^{M} \left(\frac{\lambda_k}{\lambda_k - f}\right)^2 = \frac{1}{\gamma} \tag{7.73}$$

In Figure 7.7 we provide a graphical representation of the above equation. The
solutions can be characterized by the intersection points between the function
of $f$ in Equation (7.73), which does not depend on $\gamma$, and a horizontal
line at $\gamma^{-1}$. It can be observed from Figure 7.7 that the number of solutions to
Equation (7.73) is always even (counting multiplicities). Let us denote the
number of such solutions as $2Q$, and the solutions themselves as

$$f_Q^- \le f_Q^+ \le \dots \le f_1^- \le f_1^+$$

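where equalities hold only when there exist solutions with multiplicity 2. Now,
it turns out that $Q$ is the number of clusters in $g_M(\lambda)$ and that the support of each
of these $Q$ clusters is

$$\left[x_q^-, x_q^+\right], \quad q = 1 \dots Q, \quad \text{where } x_q^- = \Phi(f_q^-), \; x_q^+ = \Phi(f_q^+),$$

and where we have defined

$$\Phi(f) = f \left(1 - \frac{\gamma}{M} \sum_{k=1}^{M} \frac{\lambda_k}{\lambda_k - f}\right)$$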

Figure 7.8(a) provides a typical representation of the function $\Phi(f)$ above. It
turns out that, as already observed in the figure, the function $\Phi(f)$ preserves the
ordering in the values of $f_Q^-, \dots, f_1^+$. This means that
$x_Q^- \le x_Q^+ \le \dots \le x_2^- \le x_2^+ \le x_1^- \le x_1^+$, and therefore we can express
the support of $g_M(\lambda)$ (excluding Dirac deltas at zero when $\gamma > 1$) as the union of
$Q$ closed intervals

$$\mathcal{S}_M = \left[x_Q^-, x_Q^+\right] \cup \dots \cup \left[x_1^-, x_1^+\right]$$

where each interval $[x_q^-, x_q^+]$ corresponds to the transformation of an interval
$[f_q^-, f_q^+]$ obtained from the solutions to Equation (7.73) (see the right side of
Figure 7.8(b)).

FIGURE 7.7 Graphical representation of the solutions to Equation (7.73).

FIGURE 7.8 Graphical representation of (a) transformation $\Phi(f)$ and (b) corresponding support
of the clusters in $g_M(\lambda)$.
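The cluster supports can be computed directly from Equation (7.73) and the transformation $\Phi(f)$. The sketch below is one possible numerical route, assumed here for illustration: it clears the denominators of Equation (7.73) to obtain a polynomial in $f$, keeps its real roots, and maps them through $\Phi$. The function name and the use of relative multiplicities (weights that sum to one) are assumptions, and only the case $\gamma < 1$ is considered.

```python
import numpy as np
from numpy.polynomial import Polynomial

def cluster_supports(eigs, weights, gamma):
    """Supports [x_q^-, x_q^+] of the clusters of g_M, lowest cluster first."""
    eigs = [float(l) for l in eigs]              # distinct true eigenvalues
    weights = [float(p) for p in weights]        # relative multiplicities, sum to 1
    sq = [Polynomial([l, -1.0]) ** 2 for l in eigs]          # (l_i - f)^2
    # prod_i (l_i - f)^2 - gamma * sum_i p_i l_i^2 prod_{j != i} (l_j - f)^2 = 0
    poly = Polynomial([1.0])
    for fac in sq:
        poly = poly * fac
    for i, (l, p) in enumerate(zip(eigs, weights)):
        term = Polynomial([gamma * p * l ** 2])
        for j, fac in enumerate(sq):
            if j != i:
                term = term * fac
        poly = poly - term
    roots = np.asarray(poly.roots(), dtype=complex)
    f = np.sort(roots[np.abs(roots.imag) < 1e-8].real)       # real solutions of (7.73)
    lam, p = np.array(eigs), np.array(weights)
    phi = f * (1.0 - gamma * np.sum(p * lam / (lam - f[:, None]), axis=1))
    x = np.sort(phi)
    return [(float(x[i]), float(x[i + 1])) for i in range(0, len(x) - 1, 2)]

# True eigenvalues {1, 5} with equal multiplicity, as in Figure 7.4.
supports_02 = cluster_supports([1.0, 5.0], [0.5, 0.5], gamma=0.2)   # two clusters
supports_08 = cluster_supports([1.0, 5.0], [0.5, 0.5], gamma=0.8)   # a single cluster
```

For $\gamma = 0.2$ the two intervals are separated by a gap, whereas for $\gamma = 0.8$ they have merged into one.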

By observing the function of $f$ in Equation (7.73) as plotted in Figure 7.7,
one can conclude that each eigenvalue $\lambda_m$ of the true covariance matrix can
be associated with a particular cluster of $g_M(\lambda)$, in the sense that
there exists one interval $[f_q^-, f_q^+]$ for which $\lambda_m \in [f_q^-, f_q^+]$. This way, one
can associate $\lambda_m$ with the cluster having support $[x_q^-, x_q^+]$. Now it is also clear
from Figure 7.7 that when $\gamma \to 0$ (and thus when the horizontal line at $\gamma^{-1}$
moves upward), the original clusters split into smaller ones, up to the point
where each eigenvalue of the true covariance matrix will be associated with a
unique cluster of $g_M(\lambda)$. In other words, if the number of samples per antenna
is sufficiently high (or, equivalently, $\gamma$ is sufficiently low), each distinct
eigenvalue of the true covariance matrix will be associated with only one cluster
in $g_M(\lambda)$.

From this description, one can readily derive the minimum number of
samples per antenna required for a cluster associated with a particular eigenvalue
to split from the rest of the asymptotic eigenvalue distribution. To do that, one
only needs to find the values of the function of $f$ in Equation (7.73)
at the corresponding inflexion points. It is particularly interesting to
characterize the minimum number of samples per antenna that guarantees that the
cluster associated with the noise eigenvalue (i.e., the lowest eigenvalue) is
separated from the clusters associated with the other (i.e., signal) eigenvalues. If we
denote by $f^*_{\min}$ the minimum inflexion point of the function of $f$ in
Equation (7.73) (see Figure 7.9), namely, the lowest solution to the following
equation in $f^*$:

$$\frac{1}{M} \sum_{k=1}^{M} \frac{\lambda_k^2}{\left(\lambda_k - f^*\right)^3} = 0 \tag{7.74}$$

then we can establish that the noise cluster of $g_M(\lambda)$ is asymptotically separated
from the rest of the eigenvalue distribution whenever

$$T/M > \frac{1}{M} \sum_{k=1}^{M} \left(\frac{\lambda_k}{\lambda_k - f^*_{\min}}\right)^2 \tag{7.75}$$

FIGURE 7.9 ... guarantee asymptotic separability of signal and noise subspaces.
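As a worked illustration of Equations (7.74) and (7.75), consider again the two true eigenvalues {1, 5} with equal multiplicity. Equation (7.74) becomes $(5 - f^*)^3 = 25 (f^* - 1)^3$, so that $f^* \approx 2.019$, and the right side of Equation (7.75) evaluates to about 1.888 samples per antenna. The sketch below computes the same quantities numerically; the function name and the polynomial root-finding route are assumptions of this example.

```python
import numpy as np
from numpy.polynomial import Polynomial

def noise_separation_threshold(eigs, weights):
    """Lowest solution of (7.74) and the resulting bound on T/M from (7.75)."""
    eigs = [float(l) for l in eigs]              # distinct true eigenvalues
    weights = [float(p) for p in weights]        # relative multiplicities, sum to 1
    cubes = [Polynomial([l, -1.0]) ** 3 for l in eigs]       # (l_i - f)^3
    # Numerator of sum_i p_i l_i^2 / (l_i - f)^3 after clearing denominators.
    numer = Polynomial([0.0])
    for i, (l, p) in enumerate(zip(eigs, weights)):
        term = Polynomial([p * l ** 2])
        for j, fac in enumerate(cubes):
            if j != i:
                term = term * fac
        numer = numer + term
    roots = np.asarray(numer.roots(), dtype=complex)
    f_star = float(np.min(roots[np.abs(roots.imag) < 1e-8].real))
    lam, p = np.array(eigs), np.array(weights)
    threshold = float(np.sum(p * (lam / (lam - f_star)) ** 2))
    return f_star, threshold

f_star, threshold = noise_separation_threshold([1.0, 5.0], [0.5, 0.5])
```

With five samples per antenna, as in Figure 7.4, the bound is comfortably exceeded, which is consistent with the two separated clusters seen there.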


Another crucial aspect that must be addressed refers to the concentration and
separation properties of sample eigenvalues with respect to the deterministic
density $g_M(\lambda)$. It should be clear that the histogram of the eigenvalues of the
sample covariance matrix is close to the deterministic density $g_M(\lambda)$
as $M$ and $T$ grow without bound at the same rate. However, this property is
not useful in characterizing the behavior of a particular eigenvalue because the
contribution of a single eigenvalue is lost in the asymptotic density. Thus, some
stronger results are needed to characterize where a particular eigenvalue
converges as $M$ and $T$ grow to infinity. Part of the answer to this was given in Bai
and Silverstein (1998), where it was established that, for sufficiently high $M, T$,
all the eigenvalues of the sample covariance matrix are located inside the support
of $g_M(\lambda)$, that is, $\mathcal{S}_M$. This was later refined in Bai and Silverstein (1999), where
it was shown that separation is exact in the sense that, with probability one for
all large $M, T$, the number of sample eigenvalues inside a particular cluster of
$g_M(\lambda)$ is equal to the number of true eigenvalues associated with that particular
cluster (counting multiplicities). The only exception to this is $\gamma > 1$, where $M - T$
sample eigenvalues are equal to zero and thus do not contribute to the lowest
cluster.

An interesting consequence of the above result is the fact that, if we assume
that the noise (lowest) eigenvalue cluster is separated from the signal cluster,
then the probability that a noise sample eigenvalue moves to the signal part of
the spectrum is equal to zero for all sufficiently large $M, T$. In other words, if
the noise subspace has dimension $M - m$, one can ensure that the $M - m$ lowest
sample eigenvalues are located in the cluster with support $[x_Q^-, x_Q^+]$ for all large
$M, T$, provided that the ratio $T/M$ of samples per antenna element is greater than
the right side of Equation (7.75). In this situation, one can ensure that signal and
noise sample eigenvalues will be separated with probability one for all $M, T$ that
are sufficiently large.


7.5.2 Using Large Random Matrix Theory to Describe 
MUSIC Behavior 


It turns out that whenever there is asymptotic separation between signal and noise
subspaces, one can effectively describe the behavior of the sample eigenvalues
and eigenvectors using the technique presented in Mestre (2006b). Let $R_M$ and
$\hat{R}_M$ have the following eigendecomposition:

$$R_M = \sum_{k=1}^{M} \lambda_k e_k e_k^H, \qquad \hat{R}_M = \sum_{k=1}^{M} \hat{\lambda}_k \hat{e}_k \hat{e}_k^H$$

where

$$\lambda_{m+1} = \dots = \lambda_M = \sigma_0^2$$

is the true noise eigenvalue, which has multiplicity $M - m$ (the true noise
eigenvectors are taken to form a basis for the corresponding noise subspace). We
also denote as $E_N$ (resp. $\hat{E}_N$) a matrix containing the true (resp. sample) noise
eigenvectors, and as $E_S$ (resp. $\hat{E}_S$) the matrix that contains the true (resp.
sample) signal eigenvectors. Now, in order to investigate the asymptotic behavior
of the MUSIC algorithm it is sufficient to investigate the following quadratic
form:

$$\hat{\phi}(\hat{R}_M, \theta) = S^H \hat{E}_N \hat{E}_N^H S$$


which, as indicated in Equation (7.70), can be expressed as

$$\hat{\phi}(\hat{R}_M, \theta) = \frac{1}{2\pi i} \oint_{C^-} \hat{s}_M(z)\, dz$$

where $\hat{s}_M(z)$ is defined in Equation (7.69) and where, as before, $C^-$ is a negatively
oriented simple contour enclosing only the noise sample eigenvalues. Using
Theorem 7.1 we can see that the integrand is almost surely close to $\bar{s}_M(z)$ as
defined in Equation (7.71), in the sense that $|\hat{s}_M(z) - \bar{s}_M(z)| \to 0$ with probability
one as $M, T \to \infty$ at the same rate. Assuming that noise and signal eigenvalue
clusters in $g_M(\lambda)$ are separated, and omitting some technicalities (see Mestre,
2008b, for further details), we can state that

$$\left| \hat{\phi}(\hat{R}_M, \theta) - \frac{1}{2\pi i} \oint_{C^-} \bar{s}_M(z)\, dz \right| \to 0$$

almost surely, where we have basically replaced $\hat{s}_M(z)$ by its asymptotic
deterministic equivalent, $\bar{s}_M(z)$. As a consequence, we only need to investigate the
form of the above integral to find the form of the asymptotic MUSIC cost
function.

In order to do this, we insert the form of $\bar{s}_M(z)$ given in Equation (7.71), so
that

$$\frac{1}{2\pi i} \oint_{C^-} \bar{s}_M(z)\, dz = \sum_{k=1}^{M} w_{\text{music}}(k)\, S^H e_k e_k^H S \tag{7.76}$$

where

$$w_{\text{music}}(k) = \frac{1}{2\pi i} \oint_{C^-} \frac{1}{\lambda_k \left(1 - \gamma - \gamma z \bar{m}_M(z)\right) - z}\, dz$$

We can already observe in Equation (7.76) that the MUSIC cost function is
asymptotically equivalent to a linear combination of the true signal and noise
subspaces. The weights $w_{\text{music}}(k)$ corresponding to the signal subspace, $k \le m$,
indicate that part of the content in the noise sample subspace asymptotically leaks
into the true signal subspace. To determine the degree of leakage between the
sample and the true subspaces, one needs to characterize the weights $w_{\text{music}}(k)$
in closed form. To do this, we define the complex function

$$f_M(z) = \frac{z}{1 - \gamma - \gamma z \bar{m}_M(z)} \tag{7.77}$$

so that we can express $w_{\text{music}}(k)$ as

$$w_{\text{music}}(k) = \frac{1}{2\pi i} \oint_{C^-} \frac{1}{z}\, \frac{f_M(z)}{\lambda_k - f_M(z)}\, dz \tag{7.78}$$


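Inserting Equation (7.77) into Equation (7.72), one can readily see that $f_M(z)$ is
the solution to

$$z = f_M(z) \left(1 - \frac{\gamma}{M} \sum_{k=1}^{M} \frac{\lambda_k}{\lambda_k - f_M(z)}\right) \tag{7.79}$$

On the other hand, taking derivatives of Equation (7.79) with respect to $z$ on both
sides, we obtain

$$1 = f_M'(z) \left(1 - \frac{\gamma}{M} \sum_{k=1}^{M} \left(\frac{\lambda_k}{\lambda_k - f_M(z)}\right)^2\right) \tag{7.80}$$

Now, inserting Equations (7.79) and (7.80) into Equation (7.78) we get to the
key expression:

$$w_{\text{music}}(k) = \frac{1}{2\pi i} \oint_{C^-}
\frac{1 - \frac{\gamma}{M} \sum_{r=1}^{M} \left(\frac{\lambda_r}{\lambda_r - f_M(z)}\right)^2}
     {1 - \frac{\gamma}{M} \sum_{r=1}^{M} \frac{\lambda_r}{\lambda_r - f_M(z)}}\,
\frac{1}{\lambda_k - f_M(z)}\, f_M'(z)\, dz$$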


At this point, we make the nontrivial observation that, when $z$ moves along the
real axis, $f_M(z)$ describes a contour as illustrated in Figure 7.10, where we also
show the corresponding complex conjugate curve. Observe from the figure that
$f_M(z)$ together with its conjugate $f_M^*(z)$ can be concatenated to form a contour
that encloses the true noise eigenvalue only. Thus, it is possible (see Mestre,
2008b, for further details) to use $f$ as the complex integration variable in the
above expression:

$$w_{\text{music}}(k) = \frac{1}{2\pi i} \oint_{C^-}
\frac{1 - \frac{\gamma}{M} \sum_{r=1}^{M} \left(\frac{\lambda_r}{\lambda_r - f}\right)^2}
     {1 - \frac{\gamma}{M} \sum_{r=1}^{M} \frac{\lambda_r}{\lambda_r - f}}\,
\frac{1}{\lambda_k - f}\, df$$

where here again $C^-$ is a negatively oriented contour enclosing only the true noise
eigenvalue and having index 1 with respect to it. The resulting integral is easily
solved using the residue theorem (Marsden and Hoffman, 1987), noting that the
integrand has only two poles inside $C^-$: one located at the true noise eigenvalue
$\lambda_{m+1} = \dots = \lambda_M = \sigma_0^2$ and one located at the value that nulls out the denominator,
which corresponds to the minimum solution to the following equation (denoted
$\mu_{\min}$):

$$\frac{1}{M} \sum_{r=1}^{M} \frac{\lambda_r}{\lambda_r - \mu_{\min}} = \frac{1}{\gamma} \tag{7.81}$$

FIGURE 7.10 Evolution of the curve described by $f_M(z)$ when $z$ moves along the real axis, and
the corresponding complex conjugate curve $f_M^*(z)$.

After some calculation, one obtains

$$w_{\text{music}}(k) =
\begin{cases}
1 - \dfrac{1}{M - m} \displaystyle\sum_{r=1}^{m} \left( \dfrac{\sigma_0^2}{\lambda_r - \sigma_0^2} - \dfrac{\mu_{\min}}{\lambda_r - \mu_{\min}} \right), & k > m \\[2ex]
\dfrac{\sigma_0^2}{\lambda_k - \sigma_0^2} - \dfrac{\mu_{\min}}{\lambda_k - \mu_{\min}}, & k \le m
\end{cases} \tag{7.82}$$
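The weights in Equation (7.82) are simple to evaluate once $\mu_{\min}$ is known. The following sketch is an illustration under assumptions made here: the function name is hypothetical, $\mu_{\min}$ is obtained as the smallest real root of Equation (7.81) after clearing denominators, and the example uses two signal eigenvalues {4, 3} over a unit noise floor with M = 20.

```python
import numpy as np
from numpy.polynomial import Polynomial

def music_weights(true_eigs, m, T):
    """Return (mu_min, w_noise, w_signal) following Equations (7.81) and (7.82)."""
    lam = np.sort(np.asarray(true_eigs, dtype=float))[::-1]   # descending order
    M = lam.size
    gamma = M / T
    sigma2 = lam[-1]                       # noise eigenvalue (multiplicity M - m assumed)
    vals, counts = np.unique(lam, return_counts=True)
    p = counts / M
    # prod_i (v_i - mu) - gamma * sum_i p_i v_i prod_{j != i} (v_j - mu) = 0
    factors = [Polynomial([float(v), -1.0]) for v in vals]
    poly = Polynomial([1.0])
    for fac in factors:
        poly = poly * fac
    for i in range(len(vals)):
        term = Polynomial([gamma * float(p[i] * vals[i])])
        for j, fac in enumerate(factors):
            if j != i:
                term = term * fac
        poly = poly - term
    roots = np.asarray(poly.roots(), dtype=complex)
    mu_min = float(np.min(roots[np.abs(roots.imag) < 1e-8].real))
    w_signal = sigma2 / (lam[:m] - sigma2) - mu_min / (lam[:m] - mu_min)
    w_noise = 1.0 - float(np.sum(w_signal)) / (M - m)
    return mu_min, w_noise, w_signal

true_eigs = [4.0, 3.0] + [1.0] * 18        # M = 20, m = 2 signal eigenvalues
mu_min, w_noise, w_signal = music_weights(true_eigs, m=2, T=100)      # gamma = 0.2
_, w_noise_big, w_signal_big = music_weights(true_eigs, m=2, T=100000)
```

In this example the signal weights are small but strictly positive for $\gamma = 0.2$, and they shrink toward zero as the number of snapshots grows.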


In conclusion, we can state that, assuming that the number of samples per
antenna is sufficiently high so that the eigenvalue-splitting condition in
Equation (7.75) is fulfilled, as $M, T \to \infty$ at the same rate, the MUSIC cost function
$\hat{\phi}(\hat{R}_M, \theta) = S^H \hat{E}_N \hat{E}_N^H S$ is asymptotically close to a deterministic counterpart
$\bar{\phi}(R_M, \theta)$ in the sense that $|\hat{\phi}(\hat{R}_M, \theta) - \bar{\phi}(R_M, \theta)| \to 0$ with probability one,
where

$$\bar{\phi}(R_M, \theta) = \underbrace{w_{\text{music}}(M)\, S^H E_N E_N^H S}_{\text{Desired cost function}}
+ \underbrace{\sum_{k=1}^{m} w_{\text{music}}(k)\, S^H e_k e_k^H S}_{\text{Leakage into the true signal subspace}} \tag{7.83}$$


Thus, we see that the sample noise subspace can be asymptotically described
as having a component on the true noise subspace plus an additional undesired
component on the true signal subspace. When the number of samples per antenna
increases (or, equivalently, when $\gamma \to 0$), we recover the desired cost function
$S^H E_N E_N^H S$. Indeed, observe from Equation (7.81) that $\mu_{\min} \to \sigma_0^2$ as $\gamma \to 0$, and
consequently

$$w_{\text{music}}(k) \to
\begin{cases}
1, & k > m \\
0, & k = 1 \dots m
\end{cases}$$

so that $\bar{\phi}(R_M, \theta) \to S^H E_N E_N^H S$. For the general case where $\gamma > 0$, we see that
$w_{\text{music}}(k) > 0$ and therefore $\hat{\phi}(\hat{R}_M, \theta)$ is no longer a consistent estimator of
$S^H E_N E_N^H S$ when $M, T \to \infty$.

It is worth pointing out that an equivalent result can be obtained for the
signal eigenvalues using exactly the same procedure. Assume that we need to
characterize the asymptotic behavior of the quantity $S^H \hat{E}_j \hat{E}_j^H S$, where $\hat{E}_j$ contains
the eigenvectors associated with an eigenvalue $\lambda_j$ (if $\lambda_j$ has multiplicity one,
$\hat{E}_j = \hat{e}_j$).

Theorem 7.2. Under the statistical assumptions just specified, and assuming that
the eigenvalue cluster associated with $\lambda_j$ is separated from the rest of the
eigenvalue distribution $g_M(\lambda)$, we can establish that (as $M, T \to \infty$ at the same rate)

$$\left| S^H \hat{E}_j \hat{E}_j^H S - \sum_{k=1}^{M} w_j(k)\, S^H e_k e_k^H S \right| \to 0 \tag{7.84}$$

where we have defined

$$w_j(k) =
\begin{cases}
1 - \dfrac{1}{K_j} \displaystyle\sum_{\substack{r=1 \\ \lambda_r \ne \lambda_j}}^{M} \left( \dfrac{\lambda_j}{\lambda_r - \lambda_j} - \dfrac{\mu_j}{\lambda_r - \mu_j} \right), & k : \lambda_k = \lambda_j \\[2ex]
\dfrac{\lambda_j}{\lambda_k - \lambda_j} - \dfrac{\mu_j}{\lambda_k - \mu_j}, & k : \lambda_k \ne \lambda_j
\end{cases}$$

where $K_j$ is the multiplicity of $\lambda_j$ and where now $\mu_j$, $j = 1 \dots M$ are the solutions
to Equation (7.81) repeated according to the multiplicity of each $\lambda_j$.

The proof of this general result can be found in Mestre (2008b).


7.5.3 A Simplified Scenario: The Spiked Population Model 

The results just presented are quite powerful, but allow for little interpretation. The reason for that stems in part from the fact that the derived expressions, although asymptotically valid in the large $M, T$ regime, are nonasymptotic, meaning that they depend on fixed values of $M, T$. Nothing has been assumed about how the eigenvalues and eigenvectors of $R_M$ behave as $M \to \infty$, and thus the provided expressions are quite general but difficult to study. A potential simplification of the above expressions comes from assuming a particular asymptotic behavior of the eigenvalues and eigenvectors of $R_M$. More specifically, the so-called “spiked population covariance matrix model” (Johnstone, 2001) describes the asymptotic behavior corresponding to the case where the signal subspace dimension becomes negligible compared to the observation dimension. This means that $m$ remains fixed, whereas $M \to \infty$.

Therefore, this model is essentially an asymptotic particularization of the situation analyzed up to now, based on the simplifying assumption that the contribution of the signal subspace is negligible in the asymptotic regime, in the sense that only the dimension of the noise subspace scales up with the number of antenna elements, whereas the dimension of the signal subspace $m$ remains fixed. The immediate consequence of this specific convergence behavior of the spectrum of $R_M$ is the fact that, in the asymptotic regime (when $M \to \infty$), the true covariance matrix consists of infinitely more noise eigenvalues than signal eigenvalues. In other words, the signal subspace does not contribute to the asymptotic density of eigenvalues of $R_M$, which consists of a Dirac delta function centered at $\sigma_0^2$.

Let us analyze the implications of the spiked population model on the eigenvalue separability conditions. To illustrate this, we consider a scenario in which the true covariance matrix has three eigenvalues at {4, 3, 1}, where the first two eigenvalues (signal eigenvalues, 4 and 3) have multiplicity 1 regardless of the matrix dimension, whereas the multiplicity of the smallest eigenvalue (noise eigenvalue) is fixed at $M - 2$. In Figure 7.11 we represent the evolution of the support of $g_M(\lambda)$ as a function of $M$. The support is described as $S_M = [x_3^-, x_3^+] \cup [x_2^-, x_2^+] \cup [x_1^-, x_1^+]$, where each interval is associated with a different true eigenvalue. Now we can observe from Figure 7.11 that the support of the clusters associated with the signal eigenvalues shrinks as $M \to \infty$. This is quite reasonable since the contribution of these eigenvalues to the global density becomes negligible as $M \to \infty$ because their multiplicity does not scale up with the matrix dimension. Thus, we can readily conclude that, in a spiked population model, there is always separation between the signal eigenvalue clusters. This is not necessarily the case for the separability between signal and noise subspaces, because the noise sample eigenvalue cluster does not disappear as $M \to \infty$. Let us derive the actual condition that guarantees separability of the signal and noise subspaces.

FIGURE 7.11 Evolution of asymptotic support as a function of M in a spiked population covariance matrix model (c = 0.1) with three eigenvalues {4, 3, 1} with multiplicities {1, 1, M − 2}. The number of samples per antenna was fixed equal to 10 ($\gamma = 0.1$).

Under the spiked population simplification of the original model (which implies letting $M \to \infty$ for $m$ fixed in the original formulas), the asymptotic subspace-splitting condition in Equation (7.75) becomes

\[
\frac{T}{M} > \left( \frac{\sigma_0^2}{\lambda_m - \sigma_0^2} \right)^2 \tag{7.85}
\]

which can also be expressed as

\[
\lambda_m > \sigma_0^2 \left( 1 + \sqrt{\gamma} \right) \tag{7.86}
\]

where we recall that $\gamma$ is the asymptotic inverse of the number of samples per antenna, $(T/M)^{-1}$. If the above condition holds, all the clusters are separated in the asymptotic eigenvalue distribution. Indeed, as previously shown, we can ensure that the cluster associated with a particular signal eigenvalue is always separated from the cluster associated with another signal eigenvalue for sufficiently high $M$. On the other hand, the cluster associated with the noise subspace does not disappear (because the multiplicity of the noise eigenvalue scales up with $M$), and thus we can guarantee that the noise eigenvalue is separated from the minimum signal eigenvalue $\lambda_m$ as long as Equation (7.85) holds.
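The equivalence of the two forms of the separability condition is a short manipulation. As a sketch, take the first form to be $T/M > \left(\sigma_0^2/(\lambda_m - \sigma_0^2)\right)^2$ (our reading of Equation (7.85), stated here as an assumption) and use $\gamma = M/T$ together with $\lambda_m > \sigma_0^2$:

\begin{align*}
\frac{1}{\gamma} > \left( \frac{\sigma_0^2}{\lambda_m - \sigma_0^2} \right)^2
&\iff \lambda_m - \sigma_0^2 > \sigma_0^2 \sqrt{\gamma} \\
&\iff \lambda_m > \sigma_0^2 \left( 1 + \sqrt{\gamma} \right).
\end{align*}

For instance, with $\sigma_0^2 = 1$ and ten samples per antenna ($\gamma = 0.1$), any signal eigenvalue above $1 + \sqrt{0.1} \approx 1.32$ is asymptotically separated from the noise cluster.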

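Next let us investigate the behavior of the weights $w_{\mathrm{music}}(k)$ under the spiked population covariance model. Note first that Equation (7.81) can equivalently be written as

\[
\frac{1}{M} \left[ \sum_{k=1}^{m} \frac{\lambda_k}{\lambda_k - \mu_{\min}} + (M - m) \frac{\sigma_0^2}{\sigma_0^2 - \mu_{\min}} \right] = \frac{1}{\gamma} \tag{7.87}
\]

By definition, $\mu_{\min} < \sigma_0^2 = \lambda_M = \cdots = \lambda_{m+1} < \lambda_m < \cdots < \lambda_1$. Thus, $\lambda_j - \mu_{\min}$ can never go to zero for any $1 \le j \le m$. Consequently, the first term of Equation (7.87) will go to zero as $M \to \infty$ for a fixed $m$, and $\mu_{\min}$ will converge to the solution of

\[
\frac{\sigma_0^2}{\sigma_0^2 - \mu_{\min}} = \frac{1}{\gamma} \tag{7.88}
\]

namely,

\[
\mu_{\min} \to \sigma_0^2 (1 - \gamma) \tag{7.89}
\]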
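The convergence of $\mu_{\min}$ can be checked numerically. The sketch below is illustrative only: it assumes that the defining equation has the form $\frac{1}{M}\sum_k \lambda_k/(\lambda_k - \mu) = 1/\gamma$ with $\gamma < 1$, uses the eigenvalues {4, 3, 1} of the Figure 7.11 example, and finds the smallest root by plain bisection. The function name `mu_min` is hypothetical.

```python
import numpy as np

def mu_min(signal_eigs, M, gamma, sigma2=1.0, iters=200):
    """Smallest root of (1/M) * sum_k lambda_k / (lambda_k - mu) = 1/gamma (gamma < 1)."""
    signal_eigs = np.asarray(signal_eigs, dtype=float)
    m = signal_eigs.size

    def f(mu):
        signal_term = np.sum(signal_eigs / (signal_eigs - mu))
        noise_term = (M - m) * sigma2 / (sigma2 - mu)
        return (signal_term + noise_term) / M - 1.0 / gamma

    # f is increasing on (0, sigma2): f(0) = 1 - 1/gamma < 0 and f -> +inf as mu -> sigma2
    lo, hi = 0.0, sigma2 * (1.0 - 1e-12)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

gamma, sigma2 = 0.1, 1.0
for M in (10, 100, 1000, 10000):
    # Second column: finite-M root; third column: spiked-model limit sigma2 * (1 - gamma)
    print(M, mu_min([4.0, 3.0], M, gamma, sigma2), sigma2 * (1.0 - gamma))
```

The gap between the finite-$M$ root and the limit shrinks roughly in proportion to $1/M$, which is the behavior expected when the signal term of the sum carries a $1/M$ factor.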


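Consequently, for the spiked population model, we can express

\[
w_{\mathrm{music}}(k) = \begin{cases}
1 & k > m \\[1ex]
\dfrac{\gamma \sigma_0^2 \lambda_k}{(\lambda_k - \sigma_0^2)\left(\lambda_k - (1 - \gamma)\sigma_0^2\right)} & k \le m
\end{cases}
\]

Under the spiked population model, the MUSIC cost function is thus asymptotically equivalent to

\[
\bar\varphi(\theta) = S^H E_N E_N^H S + \gamma \sigma_0^2 \sum_{k=1}^{m} \frac{\lambda_k}{(\lambda_k - \sigma_0^2)\left(\lambda_k - (1 - \gamma)\sigma_0^2\right)}\, S^H e_k e_k^H S \tag{7.90}
\]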
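The signal-eigenvector weights in the spiked limit can be sanity-checked with a few lines of Python. This is a sketch under stated assumptions: the weight is taken to be $\sigma_0^2/(\lambda_k - \sigma_0^2) - \mu_{\min}/(\lambda_k - \mu_{\min})$ with $\mu_{\min} = \sigma_0^2(1 - \gamma)$, which is our reading of the general MUSIC weights specialized to this model, and both function names are hypothetical.

```python
def spiked_music_weight(lam, gamma, sigma2=1.0):
    """Assumed asymptotic MUSIC weight of a signal eigenvector (k <= m), spiked model."""
    mu_min = sigma2 * (1.0 - gamma)  # limit of the smallest root, Eq. (7.89)
    return sigma2 / (lam - sigma2) - mu_min / (lam - mu_min)

def spiked_music_weight_closed(lam, gamma, sigma2=1.0):
    """Same weight after putting both fractions over a common denominator."""
    return gamma * sigma2 * lam / ((lam - sigma2) * (lam - (1.0 - gamma) * sigma2))

lam, sigma2 = 64.0, 1.0
for gamma in (4.0 / 3.0, 0.1, 1e-3, 1e-6):
    w = spiked_music_weight(lam, gamma, sigma2)
    # The weight (leakage of the signal eigenvector into the sample noise subspace)
    # vanishes as gamma -> 0, i.e., as the number of samples per antenna grows.
    print(gamma, w, spiked_music_weight_closed(lam, gamma, sigma2))
```

Under this reading, $1 - w_{\mathrm{music}}(k)$ is the fraction of the $k$th true signal eigenvector that remains in the sample signal subspace, so the weight doubles as a direct measure of subspace leakage.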


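A similar simplification can be carried out for the expressions associated with the signal sample eigenvectors in Equation (7.84). Assume for simplicity that $K_j = 1$, so that we are dealing with single-multiplicity signal eigenvectors. Expressing the weights $w_j(k)$, $1 \le j \le m$ as

\[
w_j(k) = \begin{cases}
1 - \displaystyle\sum_{\substack{r=1 \\ r \neq j}}^{m} \left( \dfrac{\lambda_j}{\lambda_r - \lambda_j} - \dfrac{\mu_j}{\lambda_r - \mu_j} \right) - (M - m) \dfrac{\sigma_0^2 (\lambda_j - \mu_j)}{(\sigma_0^2 - \lambda_j)(\sigma_0^2 - \mu_j)} & k = j \\[2ex]
\dfrac{\lambda_j}{\lambda_k - \lambda_j} - \dfrac{\mu_j}{\lambda_k - \mu_j} & k \neq j
\end{cases}
\]

and noting that for the spiked population model (see Johnson et al., 2008) $\mu_j \to \lambda_j$ and

\[
\frac{\gamma \lambda_j}{M (\lambda_j - \mu_j)} \to 1 - \gamma \frac{\sigma_0^2}{\sigma_0^2 - \lambda_j}, \quad 1 \le j \le m
\]

we obtain

\[
w_j(k) \to \begin{cases}
\dfrac{1 - \gamma \left( \dfrac{\sigma_0^2}{\sigma_0^2 - \lambda_j} \right)^2}{1 - \gamma \dfrac{\sigma_0^2}{\sigma_0^2 - \lambda_j}} & k = j \\[3ex]
0 & k \neq j
\end{cases}
\]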

Thus, one can ensure that, under the spiked population covariance simplification, and assuming that the separability condition (7.86) holds, one has

\[
\left| S^H \hat e_j \right|^2 \to \frac{1 - \gamma \left( \dfrac{\sigma_0^2}{\sigma_0^2 - \lambda_j} \right)^2}{1 - \gamma \dfrac{\sigma_0^2}{\sigma_0^2 - \lambda_j}} \left| S^H e_j \right|^2 \tag{7.91}
\]

where $S$ can be any deterministic column vector with unit norm. This is precisely the result introduced by Paul (2004) for the specific class of spiked population covariance matrices, where again a fixed limited number of eigenvalues is greater than the smallest eigenvalue, the multiplicity of which grows with the number of antennas. If $S$ is replaced with an eigenvector of the true covariance matrix $e_k$, then (under the spiked population covariance matrix model, and assuming asymptotic subspace separation) the projection of a sample eigenvector onto the linear space spanned by an eigenvector associated with a different eigenvalue converges to zero:

\[
\left| e_k^H \hat e_j \right|^2 \to \frac{1 - \gamma \left( \dfrac{\sigma_0^2}{\sigma_0^2 - \lambda_j} \right)^2}{1 - \gamma \dfrac{\sigma_0^2}{\sigma_0^2 - \lambda_j}}\, \delta_{j-k} \tag{7.92}
\]


0 


°0 ~ X J 


Additionally, Paul (2004) studied the convergence of the sample eigenvector $\hat e_j$ when there is no asymptotic separation between signal and noise subspaces: $\lambda_m < \sigma_0^2 (1 + \sqrt{\gamma})$. In particular, he established that, under the spiked population covariance matrix model,

\[
\left| e_j^H \hat e_j \right| \to 0 \tag{7.93}
\]

almost surely as $M, T \to \infty$ at the same rate. (In fact, Paul proved this for real-valued Gaussian observations with a diagonal covariance matrix; we can conjecture that the result is also valid for the observation model considered here.) Paul (2004) admitted that a “crucial aspect of the work (Baik and Silverstein, 2006; Baik et al., 2005) is the discovery of a phase transition phenomenon,” which is clearly analogous to the subspace swap phenomenon known in the signal processing literature for 20 years (Tufts et al., 1988). Here it has been shown that the condition (7.75), or the simplified condition for the spiked population matrix model (7.86), which asymptotically prevents the phase transition phenomenon from occurring, is in fact the condition that guarantees the separability of the signal and noise subspaces in the asymptotic sample eigenvalue distribution.


Tufts et al. (1991) stated that the threshold effect in subspace techniques (such as MUSIC) is associated with the probability that the measured data is better approximated by some components of the orthogonal subspace than by some components of the signal subspace. A narrow investigation of the relationship between these asymptotic phase transitions and the subspace swap would therefore pinpoint the conditions under which the norm of the scalar product between the true and estimated eigenvectors in the signal subspace falls below 0.5. In this case, Tufts’s description of the subspace swap becomes clearly equivalent. Therefore, the “signal processing” subspace swap definition implies that, for the last signal eigenvector of the true covariance matrix $e_m$,

\[
e_m^H \hat E_N \hat E_N^H e_m > e_m^H \hat E_S \hat E_S^H e_m \tag{7.94}
\]

or, equivalently,

\[
e_m^H \hat E_S \hat E_S^H e_m < 0.5 \tag{7.95}
\]

That is, the signal eigenvector associated with the minimum signal eigenvalue ($\lambda_m$) is better represented by the noise sample subspace than the signal sample subspace. Accordingly, the behavior of the projection $e_j^H \hat E_S \hat E_S^H e_j$ is of prime importance for this analysis.
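The equivalence between the two forms of the subspace swap criterion follows from the completeness of the sample eigenbasis. As a brief sketch, assume that $e_m$ has unit norm and that the sample signal and noise eigenvectors together form an orthonormal basis, so that $\hat E_S \hat E_S^H + \hat E_N \hat E_N^H = I$. Writing $p = e_m^H \hat E_S \hat E_S^H e_m$ (a shorthand introduced here for illustration):

\begin{align*}
e_m^H \hat E_N \hat E_N^H e_m &= e_m^H \left( I - \hat E_S \hat E_S^H \right) e_m = 1 - p, \\
1 - p > p &\iff p < 0.5 .
\end{align*}

In other words, a subspace swap in this sense occurs exactly when less than half of the power of $e_m$ remains in the sample signal subspace.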


7.5.4 Simulation Analysis of MUSIC Breakdown 

There is an immediate temptation to associate the MUSIC-specific performance breakdown with the phase transition described by Paul (2004) and the subspace swap threshold (7.95) associated with $e_m^H \hat E_S \hat E_S^H e_m$. However, detailed analysis of the previously introduced MUSIC breakdown scenario demonstrates that the relationship is more complicated. Examining the eigendecomposition of the true covariance matrix associated with the scenario given in Equation (7.56):

\[
T = 6, \quad M = 10, \quad m = 3
\]
\[
(\sigma_1^2 = \sigma_2^2 = \sigma_3^2)/\sigma_0^2 = 20\text{-dB input SNR} \tag{7.96}
\]
\[
\theta_m = \{\arcsin(-0.40);\ 0;\ \arcsin(0.06)\} \tag{7.97}
\]

we see that the first three eigenvalues of the true covariance matrix $R_0$ are equal to

\[
\lambda_1 = 1869, \quad \lambda_2 = 1002, \quad \lambda_3 = 132, \quad \sum_{j=1}^{3} \lambda_j = 3003 \tag{7.98}
\]
For $\lambda_3 = 132$, Paul’s condition (7.86) evaluates as

\[
\lambda_3 > 1 + \sqrt{M/T} = 1 + \sqrt{10/6} = 2.29 \tag{7.99}
\]

and is strongly satisfied. Therefore, according to Equation (7.92),

\[
\left| \hat e_m^H e_m \right| \to \sqrt{\left(1 - 1.66/(131)^2\right)/\left(1 + 1.66/131\right)} = 0.9937 \tag{7.100}
\]

which still may be treated as “extraordinarily good” estimation accuracy for the sample eigenvector $\hat e_m$. A threshold phenomenon in the sense described in Paul (2004) is therefore not expected. Yet, as shown earlier in Table 7.1, almost half the MUSIC-derived estimates in this scenario contain outliers.
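The two numbers above are straightforward to reproduce. The following minimal Python sketch evaluates the spiked-model separation threshold and the predicted eigenvector correlation; the function names are illustrative, and the noise power is normalized to $\sigma_0^2 = 1$ as in the scenario.

```python
import math

def separation_threshold(M, T, sigma2=1.0):
    """Spiked-model splitting threshold: sigma2 * (1 + sqrt(M / T))."""
    return sigma2 * (1.0 + math.sqrt(M / T))

def predicted_correlation(lam, M, T, sigma2=1.0):
    """Asymptotic magnitude of the inner product between a true signal
    eigenvector (eigenvalue lam, assumed separated) and its sample estimate."""
    gamma = M / T
    d = lam - sigma2
    power = (1.0 - gamma * (sigma2 / d) ** 2) / (1.0 + gamma * sigma2 / d)
    return math.sqrt(power)

M, T, lam3 = 10, 6, 132.0
print("threshold  :", separation_threshold(M, T))
print("separated  :", lam3 > separation_threshold(M, T))
print("correlation:", predicted_correlation(lam3, M, T))
```

With $\lambda_3 = 132$ the margin over the threshold is nearly two orders of magnitude, which is why no phase transition is expected on these grounds alone.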

Therefore, quite subtle differences in the eigensubspace of the sample matrix $\hat R_M$ transform into catastrophic threshold behavior during the MUSIC transformation ($\hat R_M \to R_{\mathrm{MUSIC}}$) for this scenario. One has to conclude that breakdown in the MUSIC-produced covariance matrix model $R_{\mathrm{MUSIC}}$ for such multiple-source scenarios may occur much more readily than subspace swaps or phase transitions (Paul, 2004) in the sample covariance matrix $\hat R_M$ itself.

The transition between the MUSIC-specific threshold phenomenon and ML performance breakdown can be illustrated by modifying the closely spaced sources’ separation to ever-smaller values. When the separation for the $M = 10$, $T = 6$, $m = 3$ scenario drops to arcsin(0.003) (close to the statistical resolution limit (Smith, 2005), because the Cramér-Rao accuracy for isolated 20-dB input SNR targets in this scenario is 0.0018 radians (Swingler, 1993)), MUSIC fails completely, with all trials producing outliers, while the statistical outlier prediction also fails regularly, with 57.7% of the trials still identified as good by comparison of the $LR_u$ with the threshold 0.363 calculated for a false alarm rate of $10^{-3}$. Interestingly enough, within these misidentified trials, 15% also exceed a “strict” trial-dependent threshold based on the LR associated with the clairvoyant model. That is,

\[
LR_u(X_T, R_{\mathrm{MUSIC}}) > LR(X_T, R_0) \tag{7.101}
\]

which means the accurate (not statistical) ML “breakdown” phenomenon has been reached, which is associated with the generation of a covariance matrix model $R_{\mathrm{MUSIC}}$ that contains a severely erroneous DOA estimate (an outlier) but is still considered more likely than the true covariance matrix by the LR test. Note that in this scenario, only a very small portion of the signal energy resides in the third eigenvector:

\[
\lambda_1 = 1996.5, \quad \lambda_2 = 1001, \quad \lambda_3 = 5.5 \tag{7.102}
\]

which corresponds to a relatively small projection along the source directions of arrival

\[
\frac{\left| e_3^H S(\theta_j) \right|}{\sqrt{S^H(\theta_j) S(\theta_j)}}, \quad j = 1, 2, 3 \tag{7.103}
\]

equal to 0.0014, 0.0475, and 0.0475, correspondingly.

Therefore, the “threshold” condition (7.99) (i.e., $\lambda_3 \simeq 1 + \sqrt{\gamma}$) of the subspace swap in the sample covariance matrix $\hat R_M$ (which is the unconstrained ML covariance matrix estimate and a sufficient statistic for any inference regarding this matrix in the complex Gaussian case) is associated with the condition of ultimate ML performance breakdown.

Indeed, for the scenario with $\theta_3 = \arcsin(0.003)$, which experiences 100% MUSIC breakdown, 57.7% statistical (7.55) ML breakdown, and 15% strict (7.101) ML breakdown, the third true eigenvalue of 5.5 is close to the Paul condition (7.99) of 2.29. Compare this condition with the 42.7% MUSIC breakdown and 0.1% ML breakdown experienced at $\theta_3 = \arcsin(0.06)$ and $\lambda_3 = 132$ (see Equation 7.98).

When the difference between the threshold conditions for the MUSIC-transformed model $R_{\mathrm{MUSIC}}$ and for the sample covariance matrix $\hat R$ itself is clearly demonstrated, the nature of the “gap” between MUSIC and ML breakdown threshold conditions observed in previous s[…] Abramovich, 2004) has another justification. Because of this “gap” between ML and MUSIC breakdown conditions, it comes as no surprise that erroneous models $R_{\mathrm{MUSIC}}$ involving closely spaced sources may be identified (predicted) and rectified (cured) as long as the threshold condition (7.99) is strongly satisfied.

To further examine the large $M, T$ asymptotic analytic predictions with respect to MUSIC breakdown, let us return to the four-source scenario (7.64) for the following four source SNR values:

• Input SNR = 25 dB → no MUSIC outliers

• Input SNR = 14 dB → ~50% MUSIC outliers

• Input SNR = 9 dB → almost 100% MUSIC outliers

• Input SNR = 0 dB → onset of ML breakdown

First, in Figure 7.12, for each of these four SNR values, the sample distributions of the four eigenvalues $\hat\lambda_1, \ldots, \hat\lambda_4$ in the signal subspace are separately shown, along with the distribution of all nonzero eigenvalues in the noise subspace. As expected, the separation between the noise and signal subspace eigenvalues decreases as the source SNR decreases, but one can see that even for the SNR of 9 dB with practically 100% MUSIC breakdown, the nonzero noise subspace eigenvalues are still well separated from the minimal signal subspace eigenvalue $\hat\lambda_4$. It is only at the lowest plotted SNR of 0 dB that significant overlap between the noise and signal subspace eigenvalues is observed. The true eigenvalues of the underlying true covariance matrices (denoted Eig(M, SNR)) are

Eig(20, 25) = [11896, 7341, 5271, 794, 1, ..., 1] (7.104)

Eig(20, 14) = [945, 584, 419, 64, 1, ..., 1] (7.105)

Eig(20, 9) = [300, 185, 133, 21, 1, ..., 1] (7.106)

Eig(20, 0) = [38.6, 24.2, 17.7, 3.5, 1, ..., 1] (7.107)

which means that the subspace-splitting condition $\lambda_j > \sigma_0^2 (1 + \sqrt{\gamma})$ from Equation (7.86),

\[
\lambda_4 > 1 \times (1 + \sqrt{1.33}) = 2.15 \tag{7.108}
\]

is satisfied for all four SNR values (although the 0-dB SNR case is marginal). The subspace-splitting condition given in Equation (7.75) can be computed for the transition from the signal subspace to the noise subspace, which is the splitting condition between the fourth and fifth eigenvalues. This gives values of 0.21, 0.24, 0.29, and 1.08 for 25 dB, 14 dB, 9 dB, and 0 dB, respectively. This value is clearly less than T/M = 0.75 for all but the last SNR value.

[Figure 7.12, panels (a) through (d): M = 20, T = 15, $\theta_j = \{-20^\circ, -10^\circ, 35^\circ, 37^\circ\}$, SNR = 25, 14, 9, and 0 dB; horizontal axis: eigenvalues (log scale).]

FIGURE 7.12 Eigenvalue distributions for the scenario in Equation (7.64): (a) 25-dB SNR, no MUSIC breakdown; (b) 14-dB SNR, ~50% MUSIC outliers; (c) 9-dB SNR, ~100% MUSIC outliers; (d) 0-dB SNR, start of ML breakdown. Even in the presence of significant MUSIC breakdown for the scenario SNR of 9 and 14 dB, the signal subspace eigenvalues remain well separated from the noise subspace eigenvalues. Source: © 2008 IEEE, with permission.
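The true eigenvalues and the spiked-model check in Equation (7.108) can be explored with a short NumPy sketch. It is illustrative only: it assumes a half-wavelength uniform linear array, equal-power uncorrelated sources, and unit noise power (array geometry is not restated in this section), and all function names are hypothetical.

```python
import numpy as np

def ula_steering(theta_deg, M):
    """Steering vector of an M-element half-wavelength ULA (assumed geometry)."""
    n = np.arange(M)
    return np.exp(1j * np.pi * n * np.sin(np.deg2rad(theta_deg)))

def true_eigenvalues(M, doas_deg, snr_db, sigma2=1.0):
    """Eigenvalues (descending) of R = sigma2 * (snr * A A^H + I)."""
    A = np.column_stack([ula_steering(t, M) for t in doas_deg])
    snr = 10.0 ** (snr_db / 10.0)
    R = sigma2 * (snr * (A @ A.conj().T) + np.eye(M))
    return np.linalg.eigvalsh(R)[::-1]

def is_separated(lam_min_signal, M, T, sigma2=1.0):
    """Spiked-model condition: lam > sigma2 * (1 + sqrt(M / T))."""
    return bool(lam_min_signal > sigma2 * (1.0 + np.sqrt(M / T)))

M, T = 20, 15
doas = [-20.0, -10.0, 35.0, 37.0]
for snr_db in (25, 14, 9, 0):
    eig = true_eigenvalues(M, doas, snr_db)
    # Four signal eigenvalues, then the first noise eigenvalue, then the check
    print(snr_db, np.round(eig[:5], 1), is_separated(eig[3], M, T))
```

Whatever the array geometry, the four signal eigenvalues must sum to $4M \cdot \mathrm{SNR} + 4$ for unit-modulus steering vectors, which is a quick consistency check on any tabulated eigenspectrum.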

Based on these asymptotic splitting metrics only, one would conclude that the noise and signal subspace eigenvalues are distinct for all but the last case at 0-dB SNR, and therefore subspace techniques should operate robustly at the higher SNRs. Yet significant MUSIC breakdown occurs in the 9- and 14-dB SNR cases. It is also important to establish that the finite M, T conditions examined here do not change dramatically asymptotically (in the sense of Equation 7.65), so in addition to the M = 20, T = 15 scenario considered above, let us examine the following three scenarios with an increased M and T dimension but with the ratio M/T held constant.

• The original scenario where M = 20:

T = 15; $\theta_m = \{-20^\circ, -10^\circ, 35^\circ, 37^\circ\}$; SNR = 14 dB
Eig(20, 14) = [945, 584, 419, 64, 1, ..., 1] (7.109)

• A 200-element array:

T = 150; $\theta_m = \{-20^\circ, -10^\circ, 35^\circ, 35.2^\circ\}$; SNR = 4 dB
Eig(200, 4) = [941, 508, 498, 65, 1, ..., 1] (7.110)

• A 400-element array:

T = 300; $\theta_m = \{-20^\circ, -10^\circ, 35^\circ, 35.1^\circ\}$; SNR = 1 dB
Eig(400, 1) = [943, 508, 500, 66, 1, ..., 1] (7.111)

Inter-source separations and SNR in the increased array size scenarios in Equations (7.110) and (7.111) have been chosen to produce essentially the same signal eigenvalues as per the original scenario in Equation (7.109) with M = 20. All three eigenspectra have minimal signal subspace eigenvalues in the range of 64 to 66, which allows us to expect the same asymptotic behavior under the spiked population covariance model. In Figure 7.13, sample distributions of the projection $e_4^H \hat E_S \hat E_S^H e_4$, calculated for all three scenarios in Equations (7.109), (7.110), and (7.111) (with M/T = 1.33 in all cases), are introduced. First, observe that, in full agreement with the results derived from large Random Matrix theory, the projection is converging as $M, T \to \infty$ ($M/T \to$ constant) to a nonstatistical deterministic value. Indeed, one can observe quite a consistent convergence of the sample distributions for $e_4^H \hat E_S \hat E_S^H e_4$ to a delta function, whether the scenario contains a MUSIC outlier or not. Furthermore, while the results are converging asymptotically, the mean values observed with a modest array dimension of 20 elements are already quite accurate (to within 0.5% of the mean values observed with 400 elements).

To validate these asymptotic deterministic values, we point out that

\[
e_4^H \hat E_S \hat E_S^H e_4 = 1 - e_4^H \hat E_N \hat E_N^H e_4
\]

where the convergence of the term $e_4^H \hat E_N \hat E_N^H e_4$ has been characterized in Equation (7.83) for the general model and in Equation (7.90) for the specific model of the spiked covariance matrix limit. More specifically, if we focus on the spiked model simplification and replace $S$ with $e_4$ in these expressions, one can state that

\[
e_4^H \hat E_S \hat E_S^H e_4 \to \frac{1 - \gamma \left( \dfrac{\sigma_0^2}{\lambda_4 - \sigma_0^2} \right)^2}{1 + \gamma \dfrac{\sigma_0^2}{\lambda_4 - \sigma_0^2}} \tag{7.112}
\]

with probability one. This asymptotic expression is in fact the same as that presented in Equation (7.92), but now the projection is onto the entire sample subspace (rather than a single eigenvector). This is because, as pointed out in Figure 7.11, in the spiked population model separability between signal and noise subspaces implies separability within the signal subspace.
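The deterministic limit of this projection can be compared against a small Monte Carlo experiment. The sketch below is a simplified stand-in for the simulations described in the text, not a reproduction of them: it assumes a diagonal spiked covariance matrix (the setting analyzed by Paul) with circular complex Gaussian snapshots, borrows the signal eigenvalues of the 200-element scenario, and uses hypothetical function names.

```python
import numpy as np

def spiked_projection_limit(lam, gamma, sigma2=1.0):
    """Asymptotic value of e^H Es_hat Es_hat^H e for a separated signal eigenvalue lam."""
    d = lam - sigma2
    return (1.0 - gamma * (sigma2 / d) ** 2) / (1.0 + gamma * sigma2 / d)

def mean_sample_projection(signal_eigs, M, T, trials, rng, sigma2=1.0):
    """Average projection of the weakest true signal eigenvector onto the
    sample signal subspace, for a diagonal covariance diag(signal_eigs, sigma2, ...)."""
    signal_eigs = np.asarray(signal_eigs, dtype=float)
    m = signal_eigs.size
    eigs = np.full(M, sigma2)
    eigs[:m] = signal_eigs
    idx = int(np.argmin(signal_eigs))   # true eigenvector = canonical basis vector idx
    acc = 0.0
    for _ in range(trials):
        Z = (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))) / np.sqrt(2.0)
        X = np.sqrt(eigs)[:, None] * Z  # snapshots with covariance diag(eigs)
        R_hat = X @ X.conj().T / T
        _, vecs = np.linalg.eigh(R_hat) # eigenvalues in ascending order
        Es = vecs[:, -m:]               # sample signal subspace (m largest)
        acc += float(np.sum(np.abs(Es[idx, :]) ** 2))
    return acc / trials

rng = np.random.default_rng(0)
M, T = 200, 150
observed = mean_sample_projection([941.0, 508.0, 498.0, 65.0], M, T, trials=10, rng=rng)
predicted = spiked_projection_limit(65.0, M / T)
print("observed mean:", observed, "predicted:", predicted)
```

Projecting onto the whole sample signal subspace, rather than onto a single sample eigenvector, keeps the comparison meaningful even when two signal eigenvalues (here 508 and 498) are too close for their eigenvectors to be individually identifiable.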

As can be observed in Figure 7.13, the discrepancy between the estimated mean values for $e_4^H \hat E_S \hat E_S^H e_4$ and the prediction in Equation (7.112) is within the fourth decimal point for a set of $10^2$ Monte Carlo trials and an array of M = 400 elements. While the match for Equation (7.112) is quite good even for small arrays, the projections of the sample eigenvectors onto the individual true signal subspace eigenvectors, $|e_j^H \hat e_j|$, are separately observed to deviate significantly from Equation (7.92) even for large arrays under some circumstances. The problem occurs in scenarios where the spiked population model is too restrictive and the eigenvalue-splitting condition is not met for all signal subspace eigenvalues, contradicting the general assumption of spiked population models. For example, in the scenario in Equation (7.110) with M = 200 elements, T = 150 samples, we observe a very small difference between the second ($\lambda_2 = 508$) and third ($\lambda_3 = 498$) eigenvalue (see Figure 7.14), which indicates that the associated clusters in $g_M(\lambda)$ may not be separated. Indeed, a quick calculation according to the procedure in Figure 7.7 indicates that the support of $g_M(\lambda)$ in this scenario is given by

\[
S_M = \underbrace{[0.025, 4.53]}_{\lambda_5} \cup \underbrace{[54.82, 75.78]}_{\lambda_4} \cup \underbrace{[389.42, 618.88]}_{\lambda_2,\, \lambda_3} \cup \underbrace{[811.45, 1116]}_{\lambda_1}
\]

so that clearly the cluster associated with $\lambda_2, \lambda_3$ is not separated, as the spiked model implicitly assumes. This is an intrinsic limitation of the spiked covariance matrix model, which does not accurately model the finite M situation for the second and third eigenvalues. Of course, this limitation would be mitigated by allowing M to increase further while keeping the multiplicities of the signal eigenvalues fixed. This procedure would shrink the support of the signal eigenvalue clusters to zero, guaranteeing that separability holds for all eigenvalues (as is the case in the spiked population model).

[Figure 7.13 panel titles: T = 15, M = 20, SNR = 14 dB (mean 0.9767, predicted 0.9789); T = 150, M = 200, SNR = 4 dB (mean 0.9798, predicted 0.9795); T = 300, M = 400, SNR = 1 dB (mean 0.9797, predicted 0.9796); horizontal axis: correlation.]

FIGURE 7.13 Projection of the fourth true eigenvector onto the sample signal subspace (scenarios (7.109) through (7.111)). All scenarios show significant MUSIC breakdown (~60%, ~30%, and ~15%, respectively), but the mean projection is converging to the predicted value in Equation (7.112), with a high value that indicates little subspace leakage. Source: © 2008 IEEE, with permission.

FIGURE 7.14 Eigenvalue distributions for the first four eigenvalues and noise eigenvalues for the scenario in Equation (7.110), SNR = 4 dB. Note the significant overlap between the second and third eigenvalue distributions. Source: © 2008 IEEE, with permission.

As a consequence of this limitation of the spiked population model, intra-subspace swap for eigenvectors 2 and 3 was frequently observed, and the sample distribution of $|e_2^H \hat e_2|$ is spread widely over the [0, 1] interval, as seen in Figure 7.15. While this intra-subspace swap phenomenon does not persist asymptotically ($M, T \to \infty$, $M/T \to \gamma$) as the relative dimension of the signal subspace vanishes, it still indicates that, for breakdown analysis on finite arrays using the spiked covariance model, the projection onto the entire subspace rather than individual eigenvectors is the appropriate metric.

[Figure 7.15: four panels, horizontal axes “Correlation, Eigenvector 1” through “Correlation, Eigenvector 4”; surviving panel titles: “Mean 0.699, Prediction 0.9987” and “Mean 0.9882, Prediction 0.9897”.]

FIGURE 7.15 Inner product of sample and true eigenvectors for an M = 200-element array and the scenario in Equation (7.110). Correlation of the second and third sample eigenvectors with their associated true eigenvectors $|e_j^H \hat e_j|$ is poor because of frequent eigenvector swap, as is the match between the observed mean and the prediction from Equation (7.92). Such an intra-subspace swap should not affect MUSIC or any other subspace-based technique. Source: © 2008 IEEE, with permission.

Finally, the most important observation from the MUSIC breakdown standpoint is that for both “proper” trials with no outliers and “improper” trials with at least one outlier, the minimal signal subspace eigenvector still resides in the sample signal subspace with more than 95% of its power, converging asymptotically to 98%. This convergence is accurately predicted by Equation (7.112) for multiple SNR values, as indicated in Figure 7.16.

[Figure 7.16 panel titles: (a) sources at {−20°, −10°, 35°, 37°}, 1000 trials/SNR; (b) sources at {−20°, −10°, 35°, 35.2°}, 50 trials/SNR; (c) sources at {−20°, −10°, 35°, 35.1°}, 20 trials/SNR.]

FIGURE 7.16 Comparison of predicted and observed projection of the fourth sample eigenvector onto the true signal subspace. (a) M = 20, T = 15; (b) M = 200, T = 150; (c) M = 400, T = 300. The correspondence between the observations and the predictions above $\lambda_4 < 1 + \sqrt{\gamma}$ is accurate even at small array sizes such as M = 20. Source: © 2008 IEEE, with permission.

The main conclusion that is now supported both by large Random Matrix 
theory and direct Monte Carlo simulations is that, for the considered sce¬ 
nario, subspace swap is not responsible for the MUSIC breakdown phenomenon 
observed, and the underlying mechanism requires further exploration. 

7.5.5 Source Resolution and MUSIC Performance Breakdown

Careful examination of the pseudo-spectrum produced during trials with MUSIC outliers, such as in the scenario in Equation (7.64) with SNR = 14 dB, shows that in many trials the MUSIC algorithm selected an erroneous peak at $\theta_{er}$ despite the fact that the pseudo-spectrum value at that peak was significantly smaller than the pseudo-spectrum values at any true source direction:

\[
S^H(\theta_{er}) \hat E_S \hat E_S^H S(\theta_{er}) \ll S^H(\theta_j) \hat E_S \hat E_S^H S(\theta_j), \quad j = 1, \ldots, 4 \tag{7.113}
\]

This happened only because MUSIC was unable to resolve the third and fourth closely located sources ($\theta_3 = 35^\circ$, $\theta_4 = 37^\circ$) and instead found a single maximum in their vicinity. This well-known phenomenon of loss of MUSIC resolution capability (Kaveh and Barabell, 1986) is not directly associated with the “subspace swap” phenomenon, and in fact is associated with a significantly smaller portion of sample signal subspace energy residing in the noise subspace than is required for subspace swap as defined in Equation (7.94). This fact has been demonstrated by the experimental data in Figures 7.13 through 7.16 as well as by the large Random Matrix theory prediction (7.112).

To examine the effect of this loss of resolution further, a “resolution event” needs to be defined. Cox (1973) defines a resolution event for closely spaced sources, such as the DOAs $\theta_3$ and $\theta_4$ in the scenario in Equation (7.64), as

\[
a(\theta_3, \theta_4) = -2\, S^H(\theta^*) \hat E_S \hat E_S^H S(\theta^*) + S^H(\theta_3) \hat E_S \hat E_S^H S(\theta_3) + S^H(\theta_4) \hat E_S \hat E_S^H S(\theta_4) > 0 \tag{7.114}
\]

where $\theta^* = (\theta_3 + \theta_4)/2$. This condition means that the sample MUSIC pseudo-spectrum in the mid-point $\theta^*$ between the true DOAs ($\theta_3$ and $\theta_4$) lies below the line that connects the pseudo-spectrum values at the true DOAs.
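The resolution metric is simple to evaluate on simulated data. The following NumPy sketch is illustrative only: it assumes a half-wavelength uniform linear array with unit-norm steering vectors, equal-power uncorrelated Gaussian sources, and unit-power noise, and every function name is hypothetical.

```python
import numpy as np

def unit_steering(theta_deg, M):
    """Unit-norm steering vector of an M-element half-wavelength ULA (assumed geometry)."""
    n = np.arange(M)
    return np.exp(1j * np.pi * n * np.sin(np.deg2rad(theta_deg))) / np.sqrt(M)

def signal_projection(Es, s):
    """S^H Es Es^H S for a unit-norm vector s and an orthonormal basis Es."""
    return float(np.linalg.norm(Es.conj().T @ s) ** 2)

def cox_metric(Es, theta_a, theta_b, M):
    """a = P(theta_a) + P(theta_b) - 2 P(midpoint); a > 0 is the resolution event."""
    mid = 0.5 * (theta_a + theta_b)
    return (signal_projection(Es, unit_steering(theta_a, M))
            + signal_projection(Es, unit_steering(theta_b, M))
            - 2.0 * signal_projection(Es, unit_steering(mid, M)))

def sample_signal_subspace(doas_deg, M, T, snr_db, rng):
    """Signal subspace (m principal eigenvectors) of one sample covariance matrix."""
    m = len(doas_deg)
    A = np.sqrt(M) * np.column_stack([unit_steering(t, M) for t in doas_deg])
    amp = np.sqrt(10.0 ** (snr_db / 10.0))
    S = amp * (rng.standard_normal((m, T)) + 1j * rng.standard_normal((m, T))) / np.sqrt(2.0)
    N = (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))) / np.sqrt(2.0)
    X = A @ S + N
    _, vecs = np.linalg.eigh(X @ X.conj().T / T)
    return vecs[:, -m:]

rng = np.random.default_rng(1234)
M, T, doas = 20, 15, [-20.0, -10.0, 35.0, 37.0]
for snr_db in (9, 14, 25):
    a = [cox_metric(sample_signal_subspace(doas, M, T, snr_db, rng), 35.0, 37.0, M)
         for _ in range(200)]
    print(snr_db, "dB: mean a =", np.mean(a), " fraction a > 0 =", np.mean(np.array(a) > 0))
```

With the exact signal subspace, both end-point projections equal one, so the metric reduces to twice the leakage of the mid-point steering vector and is strictly positive; negative values can therefore only arise from estimation error in the sample subspace.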

Although this definition is commonly used, it should be noted that this metric ($a \gtrless 0$) may only approximate the actual resolution event. Indeed, in Figure 7.17 two actual examples from the set of Monte Carlo trials in the scenario in Equation (7.64) with an SNR of 15 dB are illustrated. The resolution definition is compared to a local maxima criterion, with Figure 7.17(a) showing success of the resolution event when local maxima tests fail, and Figure 7.17(b) conversely showing failure of the resolution event when local maxima tests succeed. These variations are statistical, however, and on average the probability that Equation (7.114) is positive should correspond well to the probability of actual resolution. In Lee and Wengrovitz (1990), Xu and Kaveh (1994), and Zhang (1995), analytic approximations of this resolution probability are found for the standard asymptotic conditions (M = constant, $T \to \infty$) and proven to be quite accurate for $T \gg M$, as in the example with M = 8, T = 100, considered in Zhang (1995). However, for M/T > 1, it is once again more appropriate to apply large Random Matrix theory asymptotic analysis. In accordance with the introduced methodology, one can find a nonrandom (deterministic) function $g(\theta_3, \theta_4)$, such that

\[
\left| a(\theta_3, \theta_4) - g(\theta_3, \theta_4) \right| \to 0 \tag{7.115}
\]

with probability one as $M, T \to \infty$, $M/T \to \gamma < \infty$. The condition $g(\theta_3, \theta_4) = 0$ is then associated with roughly 50% probability of resolution, and the decision threshold

\[
g(\theta_3, \theta_4) \underset{\text{no res}}{\overset{\text{res}}{\gtrless}} 0 \tag{7.116}
\]

with high (and low) resolution probability correspondingly.

[Figure 7.17 panel titles: (a) Iteration 15 (seed 1234), 15-dB SNR; (b) Iteration 2 (seed 1234), 15-dB SNR; horizontal axis: azimuth (degrees).]

FIGURE 7.17 Resolution of two closely spaced sources from the scenario in Equation (7.64), SNR = 15 dB. (a) a > 0, but only one maximum; (b) a < 0, but two maxima. The two example trials illustrate the difference between the resolution event criterion (7.114) and the local maxima criterion used in the simulation. Source: © 2008 IEEE, with permission.

First of all, in Figure 7.18 sample distributions for the $a(\theta_3, \theta_4)$ resolution metric are shown, calculated for SNR = 9, 14, and 25 dB, correspondingly. As usual, sample distributions (for SNR = 14 dB) averaged over the subsets of improper and proper MUSIC trials are introduced (recalling that for SNR = 9 dB, practically all MUSIC trials are improper while for SNR = 25 dB, all trials are proper). Although trials with outliers in general have a negative resolution metric (regardless of SNR) and those without outliers have a positive metric, there is some overlap at SNR = 14 dB. The overlap between these two distributions is explained by the phenomenon illustrated in Figure 7.17, since the Monte Carlo simulation DOA estimation algorithm searches for strict maxima in the pseudo-spectrum rather than applying the subject resolution metric.

To get an analytic expression for the asymptotic function g{ 0 ^ , 64) in Equation 
(174451) . let us once again consider Theorem 17.21 or the corresponding version 
for the noise eigenvectors given in Equation (17.821) . According to this result, if 
the signal/noise subspace-splitting condition (17.751) is satisfied, then the MUSIC 
random pseudo-spectrum (p(&M ,0 ) in Equation (17.681) asymptotically (M, T -> 
00 , M/T —>► const) tends to the nonrandom function <p(Rm, 6): 

| cp(Rm, 0 ) — <p(Rm, 0)1 — > 0 as M, T ^ 00 , M/T — > y, 0 < y< 00 (7.117) 
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where 


M 

<P(Rm, 0) = w music (fc)S H e£e ^S 

k=\ 


and where w music( k) are specified in Equation (17.82 ). 
According to this result, 


S H (0)E s EfS{0)=>l-<p(RM,e) (7.118) 


[Figure 7.18: three panels of sample distributions of $\hat{g}(\theta_3,\theta_4)$ for T = 15, M = 20: (a) 9-dB SNR; (b) 15-dB SNR; (c) 25-dB SNR.]

FIGURE 7.18 Resolution metric $\hat{g}$ for closely spaced sources $\theta_3 = 35°$, $\theta_4 = 31°$ in the scenario in
Equation (7.64). (a) Complete MUSIC breakdown; (b) ~50% MUSIC breakdown; (c) no MUSIC breakdown.
For SNR = 14 dB, the sample distributions are practically symmetric w.r.t. zero, with trials with
outliers shifted into the negative domain and trials without outliers shifted into the positive domain. For
SNR = 9 dB with ~100% MUSIC breakdown, the sample distribution resides entirely in the negative
domain, while for SNR = 25 dB with no MUSIC breakdown, it resides entirely in the positive domain.
Source: © 2008 IEEE, with permission.


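since we assume $\|S(\theta)\| = 1$, and therefore the function $g(\theta_3,\theta_4)$ in Equation
(7.115) is expressed as

$g(\theta_3,\theta_4) = -\sum_{k=1}^{M} w_{\text{MUSIC}}(k)\left[\tfrac{1}{2}|S^H(\theta_3)e_k|^2 - |S^H(\theta^*)e_k|^2 + \tfrac{1}{2}|S^H(\theta_4)e_k|^2\right]$ (7.119)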


According to this expression, one gets the asymptotic values of $g(\theta_3,\theta_4) =
-0.0066$, $-0.0004$, and $+0.0015$ for input SNRs of 25, 14, and 9 dB, respectively,
which agree precisely with the mean values in Figure 7.18.

Consistent with the earlier conclusion that subspace swap was not the sole
reason for MUSIC outlier production, one can now see that MUSIC's performance
breakdown can occur when it is unable to resolve some closely
spaced sources; for this to happen, quite negligible power of the actual signal
subspace has to reside in the sample noise subspace ("subspace leakage").
Another important conclusion is that the large Random Matrix theory methodology
has been proven to be sufficiently accurate for scenarios with surprisingly
small M (= 20) and T (= 15), compared with the asymptotic requirements
($M, T \to \infty$, $M/T \to$ constant), and can provide accurate predictions of MUSIC
resolution performance.

The introduced analysis sheds some light on the reasons for the disappointing
threshold performance improvement delivered by the G-MUSIC algorithm
(Mestre, 2006a,b; see also Mestre, 2008a, and Mestre and Lagunas, 2008), as
observed in Section 7.4. While G-MUSIC is indeed able to counter the bias
in noise subspace estimation caused by the low sample support, the almost sure
convergence to the accurate zeros of the cost function still may possess sufficient
variance such that the value $\varphi(\hat{R}_M, \theta^*)$ at a midpoint between the two closely
spaced sources is even smaller, leading to a single minimum (or maximum for the
inverse pseudo-spectrum) and corresponding loss of resolution.


7.6 SUBSPACE SWAP AND MLE PERFORMANCE 
BREAKDOWN 

In a way, the improved performance of MLE in the threshold region relative
to MUSIC (and G-MUSIC) is reflected by the fact that MUSIC breaks down
significantly earlier than estimation of the number of sources m by ITC (Lee
and Li, 1993). Both the MLE criterion and the ITC approach test the entire
covariance matrix model to fit the sample data, while MUSIC (and G-MUSIC)
selects each DOA estimate independently, with no regard to how the entire set
of produced estimates fits the input data. Of course, there is still the possibility
of different threshold conditions for MLE and ITC as well, but with a significantly
smaller gap. The significant difference between MUSIC-specific and
ML-intrinsic threshold conditions is the penalty one has to pay for replacing the
multivariate ML optimization problem that finds the set of estimates that jointly
best fit the input data, by univariate search of the function that only has the same
solution as $T \to \infty$.

MLE breakdown is observed when a set of DOA estimates that contains a 
severely erroneous estimate (an outlier) generates an LF value that exceeds the 
local extremum in the vicinity of the true model. In other words, there are maxima 
of the LF (including the global maximum) that exceed the local maximum of 
the LF considered by traditional asymptotic ML analysis ($T \to \infty$) as the ML
solution. Generation of such an outlier is dependent on proper detection of the 
number of sources present, which may also be questionable in this ML-intrinsic 
threshold condition (an issue not explored in detail here). 

For a solution that contains an outlier to be "more likely" than the actual
covariance matrix, the sample data should indeed generate a sample signal subspace
with some of its elements better represented by the true noise subspace.
Therefore, the subspace swap phenomenon is more likely to be associated with
MLE breakdown than with the breakdown in subspace techniques. To demonstrate
this, let us analyze MLE performance in the scenario in Equation (7.64)
at three SNR values: 2 dB with $\lambda_4 = 4.97$, 0 dB with $\lambda_4 = 3.51$ (again as in
Equation 7.107), and −4 dB with $\lambda_4 = 2.00$.

For SNR = 2 dB and $\lambda_4 = 4.97$, the Mestre eigenvalue-splitting condition
(7.75) for the fourth eigenvalue is just satisfied (0.68 < T/M = 0.75), indicating
(almost sure) separation of the signal and noise subspaces asymptotically (in the
GSA sense), while according to Equation (7.112) for this SNR,

$e_4^H \hat{E}_s \hat{E}_s^H e_4 \xrightarrow{\text{a.s.}} 0.69$ (7.120)

Monte Carlo simulations show a mean for $e_4^H \hat{E}_s \hat{E}_s^H e_4$ of 0.6815, agreeing well
with the prediction and indicating that the fourth eigenvector projects more onto
its proper signal subspace than the noise subspace.

For SNR = 0 dB and $\lambda_4 = 3.51$, the Mestre eigenvalue-splitting condition
(7.75) is marginally violated (1.07 > T/M = 0.75), while the projection of signal
eigenvectors onto the signal subspace is forecast via Equation (7.112) as

$e_4^H \hat{E}_s \hat{E}_s^H e_4 \xrightarrow{\text{a.s.}} 0.51$ (7.121)

Monte Carlo simulations show a mean for $e_4^H \hat{E}_s \hat{E}_s^H e_4$ of 0.5098. While the forecast
in Equation (7.112) is not technically valid when the splitting condition is
violated, at this marginal condition it still provides a projection value prediction
that is consistent with simulation results. The projection value of around 50%
indicates that the subspace swap condition given in Equation (7.95) is essentially
satisfied and a subspace swap is statistically likely, consistent with the
observation in Figure 7.3 that MLE breakdown starts to occur at 0-dB input
SNR.

For SNR = −4 dB, $\lambda_4 = 2.0$, and the condition for asymptotic convergence
($\lambda_4 > 1 + \sqrt{\gamma} = 2.15$) given in Equation (7.86) is violated, so the projection
should converge to zero asymptotically and Equation (7.112) is clearly no longer
valid. Monte Carlo simulations show a mean for $e_4^H \hat{E}_s \hat{E}_s^H e_4$ of 0.2502. The scenario
is obviously experiencing significant subspace swap, once again consistent
with results given in Figure 7.3.
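
The three SNR cases above can be checked against the convergence condition quoted with Equation (7.86) using simple arithmetic. The following Python sketch uses only the values stated in the text (M = 20, T = 15, and the three values of the fourth eigenvalue); the function and variable names are illustrative and do not come from the original.

```python
import math

def convergence_threshold(M, T):
    """Return 1 + sqrt(M/T), the eigenvalue level quoted with Equation (7.86)."""
    return 1.0 + math.sqrt(M / T)

# Values quoted in the text for the scenario of Equation (7.64).
M, T = 20, 15
lambda4_by_snr_db = {2: 4.97, 0: 3.51, -4: 2.00}

threshold = convergence_threshold(M, T)
# True where the fourth eigenvalue exceeds the threshold.
above = {snr: lam > threshold for snr, lam in lambda4_by_snr_db.items()}

print(f"1 + sqrt(M/T) = {threshold:.2f}")
for snr, lam in lambda4_by_snr_db.items():
    status = "above" if above[snr] else "below"
    print(f"SNR = {snr:+d} dB: lambda_4 = {lam:.2f} is {status} the threshold")
```

Under these values only the −4 dB case falls below the threshold, which matches the statement that the condition in Equation (7.86) is violated there.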

Having established that MLE performance breakdown in the examined
multiple-source scenario is reliably associated with subspace swap (whereas
MUSIC breakdown in the same scenario is associated with subspace leakage, as
shown in Section 7.3, and resultant loss of resolution, as shown in Section 7.5.1),
let us next examine a single-source scenario, such as that studied in Athley
(2002, 2005), where one expects that subspace swap is indeed the sole mechanism
responsible for both MLE and MUSIC DOA estimation performance
breakdown.

To this end, an additional scenario with a single target is introduced that
is based, as in Athley (2002, 2005), on a sparse minimum redundancy array
(MRA) (Moffet, 1968), where the generation of outliers is more likely as a result
of poor sidelobe performance. The following specific M = 18 configuration
(d = [0, 2, 10, 22, 53, 56, 82, 83, 89, 98, 130, 148, 153, 167, 188, 192, 205, 216])
is used, as suggested for the MRA context in Sverdlik (1975) and confirmed to
be minimally redundant (Dollas et al., 1998).
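
The lag structure of the quoted configuration can be inspected directly. The Python sketch below counts how many sensor pairs produce each spatial lag for the positions d listed above. The helper name `lag_multiplicities` is illustrative, and the check only reports repeated and uncovered lags; it does not by itself establish optimality in the sense of the cited references.

```python
from itertools import combinations

# Sensor positions quoted in the text, in units of the basic element spacing.
d = [0, 2, 10, 22, 53, 56, 82, 83, 89, 98, 130, 148, 153, 167, 188, 192, 205, 216]

def lag_multiplicities(positions):
    """Count how many sensor pairs produce each positive spatial lag."""
    counts = {}
    for a, b in combinations(sorted(positions), 2):
        counts[b - a] = counts.get(b - a, 0) + 1
    return counts

lags = lag_multiplicities(d)
num_pairs = len(d) * (len(d) - 1) // 2
repeated = {lag: n for lag, n in lags.items() if n > 1}
missing = [lag for lag in range(1, max(d) + 1) if lag not in lags]

print("elements:", len(d), "aperture:", max(d) - min(d))
print("sensor pairs:", num_pairs, "distinct lags:", len(lags))
print("repeated lags:", repeated)
print("uncovered lags up to the aperture:", len(missing))
```

With 18 elements there are 153 sensor pairs but 216 possible lags up to the aperture, so some lags are necessarily uncovered; such gaps in the co-array are one way to picture the poor sidelobe behavior mentioned in the text.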


The threshold effect of MLE estimation in this scenario (provided by the Bartlett
spectrum or conventional beamforming, CBF) can be observed in Figure 7.19
to occur around −5 dB for T = 14.


[Figure 7.19: plot titled "18-Element MRA, T = 14, 1 Source at 0°".]

FIGURE 7.19 MSE for MUSIC, G-MUSIC, and MLE (CBF) DOA estimation on an 18-element
minimum redundant array with 1000 trials/SNR step. Note that the MUSIC and G-MUSIC estimators
deliver essentially the same performance in this circumstance, as expected for single sources. Source:
© 2008 IEEE, with permission.

[Figure 7.20: panels (a)-(d); the first panel is titled "MRA, Source at [0°], SNR = -18 dB".]

FIGURE 7.20 Distribution of DOA estimation errors versus projection $e_1^H \hat{E}_s \hat{E}_s^H e_1$ for a single
target scenario on an 18-element minimum redundant array with T = 14 samples. (a) ~100% MLE
breakdown; (b) ~50% MLE breakdown; (c) rare MLE breakdown; (d) no MLE breakdown. As SNR
is increased, the projection approaches unity and the estimation accuracy improves. Note, however,
that trials with low projection values still frequently have low estimation errors. Source: © 2008
IEEE, with permission.


In Equation (7.94) subspace swap was defined as occurring when the projection
of the last true eigenvector into the underlying sample noise subspace
was higher than that into the sample signal subspace. To examine whether this
subspace swap is the sole mechanism for MLE breakdown, the DOA error of
a single source estimated with the MRA, versus the correlation between the
"maximal" sample and the true eigenvector, can be plotted for each of 1000
Monte Carlo trials and a sample size of T = 14. These plots are shown in
Figure 7.20 for source SNRs ranging from very low values that result in complete
MLE breakdown (input SNR of −18 dB, as shown in Figure 7.20(a)) to
values where there is no MLE breakdown (input SNR of 0 dB, as shown in
Figure 7.20(d)).

Figure 7.20 clearly demonstrates that when projection of the signal true eigenvector
onto the sample signal subspace is high, there is no MLE breakdown (i.e.,
the upper right quadrant of each of Figures 7.20(a)-(d) is free of any Monte Carlo
trials). Interestingly, however, the converse is not true. When the projection of
the signal true eigenvector onto the sample signal subspace is low, a DOA outlier
estimate may or may not be produced. Thus, subspace swap is a necessary but
not sufficient condition for MLE breakdown to occur.

7.7 CONCLUSION 

In this chapter, the expected likelihood approach was introduced as a mechanism
that allows assessment of the quality of estimates without resort to asymptotic
or clairvoyant analysis. To address an important low-sample support regime,
the expected likelihood approach was expanded into the sample data circumstances
where the number of samples T does not exceed the dimension M of
each snapshot. Three basic requirements for likelihood ratios to be used in an
undersampled expected likelihood framework were detailed, the most important
being the property that the p.d.f. of the likelihood ratio for the unknown true
model $R_0$ must be independent of any scenario. Undersampled likelihood ratios
that satisfy these requirements were then derived via two different mechanisms:
projection of the covariance matrix model onto the subspace spanned by the sample
covariance matrix, and a maximum-entropy extension of the central band of
the rank-deficient sample covariance matrix.

The expected likelihood detection/estimation methodology was applied over
an important class of "undersampled" scenarios, where the number of independent
Gaussian samples T does not exceed the number of antenna array
elements M, by applying likelihood ratios formulated for operation with singular
sample covariance matrices. Specific application of the introduced methodology
dealt with the well-known "threshold effect" in MUSIC, where the
lack of SNR and/or sample support T below certain threshold values causes
MUSIC to generate severely erroneous DOA estimates (outliers) with high
probability.

It was shown that a gap between MUSIC-specific breakdown and ML-intrinsic
breakdown in the sample covariance matrix $\hat{R}_M$ itself can be exploited
to reliably identify outliers, since they generate $LR_u(X_T | R_{\text{MUSIC}})$ values that
are statistically much smaller than $LR_u(X_T | R_0)$. Analytically or experimentally
calculated "scenario-free" threshold values $h^*$ with

$\text{Prob}[LR_u(X_T | R_0) < h^*] = P_{FA} \ll 1$ (7.122)

were then used, and demonstrated quite remarkable efficiency of the MUSIC-specific
breakdown prediction.
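
One way to picture the experimental route to the threshold in Equation (7.122) is as an empirical quantile of simulated likelihood-ratio values. The Python sketch below is a minimal illustration under stated assumptions: the list `placeholder_samples` merely stands in for Monte Carlo draws of the likelihood ratio under the true model, which are not computed here, and both function names are hypothetical.

```python
def empirical_threshold(lr_samples, p_fa):
    """Pick h* so that roughly a fraction p_fa of the samples fall below it."""
    ordered = sorted(lr_samples)
    k = int(p_fa * len(ordered) + 1e-9)  # number of samples allowed below h*
    k = min(k, len(ordered) - 1)
    return ordered[k]

def false_alarm_rate(lr_samples, h_star):
    """Fraction of samples strictly below the threshold."""
    return sum(1 for v in lr_samples if v < h_star) / len(lr_samples)

# Placeholder data standing in for simulated likelihood-ratio values
# under the true model (assumption: not generated from any array model).
placeholder_samples = [i / 1000 for i in range(1, 1001)]
h_star = empirical_threshold(placeholder_samples, p_fa=0.01)
rate = false_alarm_rate(placeholder_samples, h_star)
print(h_star, rate)
```

Because the distribution of the likelihood ratio under the true model is stated to be scenario-free, a threshold obtained this way would not need to be recomputed for each scenario.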

Examination of the performance of another DOA estimation algorithm 
(G-MUSIC) derived from consistency requirements in a doubly asymptotic 
framework rather than via the standard large-sample asymptotic approach 
showed some improvement in outlier production, allowing reliable operation 
at lower SNRs. However, a gap between the G-MUSIC breakdown and the 
ML-intrinsic breakdown was still shown to exist. 

To explore this further, a number of results from Random Matrix theory based
on the asymptotic environment, where both the sample support T and the array
dimension M go to infinity at the same rate, were studied. Interestingly, it was
demonstrated that the breakdown in $R_{\text{MUSIC}}$ takes place at much higher SNR
and/or sample support values T than those predicted by the Random Matrix theory
subspace swap condition for the sample covariance matrix $\hat{R}_M$. Specifically,
it was demonstrated that the "phase transition" predicted by RMT for the scenario
with $M \times m$ signal subspace of the covariance matrix ($\lambda_m < 1 + \sqrt{M/T}$)
is associated in multiple-source scenarios not with MUSIC-specific or G-MUSIC-specific
breakdown but rather with ultimate ML-intrinsic breakdown that cannot
be recovered. This was shown by verifying that MUSIC-generated covariance
matrix models $R_{\text{MUSIC}}$ contain outliers in scenarios with $\lambda_m(R_0) \gg 1 + \sqrt{M/T}$.
It appears that MUSIC-specific threshold conditions in multiple-source scenarios
are triggered by very subtle changes in the eigendecomposition of the sufficient
statistic (sample covariance matrix) $\hat{R}_M$ ("subspace leakage") rather than
the full subspace swap traditionally associated with MUSIC breakdown in the
literature.

Customarily, subspace swap is described as an event where a particular (minimal)
signal subspace eigenvector is better represented (expanded) by the noise
subspace of the sample covariance matrix than by the sample signal subspace.
The analysis here demonstrated that the RMT/GSA methodology very accurately
predicts the subspace swap conditions, even for antenna dimensions and sample
volume that are far from the M, T asymptotic regime. The same analysis showed
that MUSIC (and G-MUSIC) performance breakdown can take place when less
than 5% of the minimal signal subspace eigenvector's power resides in the sample
noise subspace, leaving more than 95% of this power residing in the sample
signal subspace. Focusing on the multiple-source case, GSA methodology was
found to be able to predict this breakdown condition using a standard resolution
test, but with asymptotically justified weighting factors, allowing the definition
of the threshold SNR and/or sample volume required for reliable resolution for
a given array configuration and scenario.

Returning to consideration of the full subspace swap condition, it was established
that MLE breakdown is reliably associated with the subspace swap
phenomenon and well predicted by the GSA methodology. It is obvious that in
scenarios where the MUSIC (and G-MUSIC) pseudo-spectrum does not differ
significantly from the conventional Bartlett spectrum (single or well-separated
sources, very low SNR, $T \to \infty$), MUSIC and MLE techniques demonstrate
similar threshold performance, with full subspace swap becoming the common
reason for breakdown in both techniques. For single-source cases, the similar
breakdown point for MLE and MUSIC is well correlated with GSA-derived
eigenvalue-splitting and subspace swap predictions. It was noted, however, that
while high projection values of the minimal signal eigenvector onto the sample
signal subspace preclude the formation of outliers leading to performance breakdown,
low projection values (indicating in some cases almost complete subspace
swap) did not always lead to performance breakdown. Therefore, subspace swap
is a necessary, but not sufficient, condition for DOA estimation breakdown, with
other factors also influencing outlier production, such as statistical variations of
the source power and manner of distribution of the signal subspace power across
the sample noise subspace.

These investigations once again point out the need for performance assessment
tools other than the standard large-sample asymptotics when dealing with
finite-sample threshold region behavior. The power of the RMT/GSA asymptotic
approach is particularly evident here, as is the capability of expected likelihood
to provide nonclairvoyant solution assessment in scenarios outside the
ML-breakdown region but within SNR and/or sample support regimes whether
or not other estimation methods fail.

In summary, the finite-sample performance assessment objectives laid out at
the beginning of the chapter can be achieved using new applications of likelihood
ratio formulations as well as Random Matrix theory, providing new
tools for the assessment of DOA estimation algorithms in the low training-sample
environment, which is increasingly dominating the field of adaptive
processing.

High-Resolution DOA
Estimation with Higher-Order 
Statistics 


Pascal Chevalier, Anne Ferreol, Laurent Albera 


8.1 INTRODUCTION 


Direction-of-arrival (DOA) estimation is one of the main components of
spectrum-monitoring and radio surveillance systems, but it also finds its place
in radar, sonar, and human electrophysiology for localization of electromagnetic
activities. From the beginning of the 1980s, many second-order (SO)
high-resolution (HR) DOA estimation methods have been developed, mainly
to overcome the limitations of classical methods such as Watson-Watt or interferometry
in multiple-source contexts or for multipath propagation channels.
Among these, subspace-based methods such as MUSIC (or 2-MUSIC) (Bienvenu
and Kopp, 1983; Schmidt, 1986) and ESPRIT (Paulraj et al., 1986; Roy and
Kailath, 1989), very powerful in multiple-source environments, are the most
popular. Indeed, in the absence of modeling errors and for a background noise
with a spatial coherence that is known (MUSIC) or equal to the Identity matrix
[...] the source signal-to-noise ratio (SNR) (Germain et al., 1989; Kaveh and
Barabell, 1986; Porat and Friedlander, 1988; Stoica and Nehorai, 1989, 1990).


However, these methods suffer from serious drawbacks. For one, they are
[...] sensors. Moreover, they are weakly robust both to modeling errors (Ferreol
et al., 2006; Friedlander, 1990; Li and Vaccaro, 1992; Swindlehurst and Kailath, 1992)
(always present in operational contexts) and to the presence of strong spatially
correlated background noise with a spatial coherence that is unknown (Paulraj
and Kailath, 1986), such as that in the HF band (Demeure and Chevalier, 1998) or
in human electrophysiology. Finally, their performance may be strongly affected
when several sources with a small angular separation have to be separated from
a limited number of snapshots (Kaveh and Barabell, 1986; Stoica and Nehorai,
1989, 1990).

Since the end of the 1980s, mainly to overcome most of the previously
mentioned limitations, higher-order (HO) HR DOA estimation methods (mainly
fourth-order, FO) have been developed for non-Gaussian sources, omnipresent
[...]. The first works on HO HR DOA estimation (Cardoso, 1990; Chiang and
Nikias, 1989; Forster and Nikias, 1991; Porat and Friedlander, 1991; Swami and
Mendel, 1991) were mainly devoted to the introduction of HO methods that are
asymptotically [...] noise is Gaussian. Methods proposed in Cardoso (1990),
Chiang and Nikias (1989), Porat and Friedlander (1991), and Swami and Mendel
(1991) exploit the FO cumulants (Amblard et al., 1996; Mendel, 1991) of the data.
Among these, Chiang and Nikias (1989) and Porat and Friedlander (1991) proposed FO
extensions to the ESPRIT and MUSIC methods, respectively (called 4-ESPRIT
and 4-MUSIC in this chapter). In contrast, the method proposed in Forster and
Nikias (1991) works in the bispectrum (Mendel, 1991) domain.

At the same time, it was proved that both FO (Cardoso, 1990) and HO
(Shamsunder and Giannakis, 1993) cumulant-based methods allow for an
increase in both resolution and the number of sources to be processed, beyond the
number of sensors. These properties are shown in Chevalier and Ferreol (1999)
and Dogan and Mendel (1995a) for FO methods, and recently in Chevalier et al.
(2005), for HO methods at an even order, to be directly related to a virtual increase
in both the effective aperture and the number of sensors in the array, induced
by the exploitation of the FO or HO data cumulants. Thus, both the FO and HO
virtual array (VA) concepts, introduced in Dogan and Mendel (1995a), Chevalier
and Ferreol (1999), and Chevalier et al. (2005), respectively, give a geometrical
interpretation of cumulants and explain most of the advantages of HO
cumulant-based DOA estimation methods. In Dogan and Mendel (1995a) even
the symmetries of the FO VA were exploited to propose a kind of FO ESPRIT
method, called FO Virtual ESPRIT, valid whatever the geometry of the array
provided that at least two sensors share the same response.

Following these first works, other FO cumulant-based methods were developed
with specific properties. Such properties concern in particular robustness
to non-Gaussian noise (Chen and Lin, 1994a; Dogan and Mendel, 1995b); the
ability to process coherent multipaths (Chen and Lin, 1994b; Gonen et al., 1997;
Yuen and Friedlander, 1997), near-field sources (Challa and Shamsunder, 1998),
nonstationary sources (Liu and Mendel, 1999a), and, in a selective manner,
cyclo-stationary sources (Shamsunder and Giannakis, 1994); and the capability
to estimate both DOA and polarization parameters from an array with diversely
polarized sensors (Gonen and Mendel, 1999). At the same time, asymptotic
sensitivities of FO MUSIC-like methods (Cardoso and Moulines, 1995; Fan
and Younan, 1995), 4-ESPRIT (Yuen and Friedlander, 1996, 1998), and FO [...]
with respect to both finite-sample effect (Cardoso and Moulines, 1995; Fan and
Younan, 1995; Yuen and Friedlander, 1996, 1998) and model errors (Fan and
Younan, 1995; Liu and Mendel, 1999b) were analyzed. It has been shown that
FO methods are more robust to modeling errors and more sensitive to finite-sample
effect than SO methods, especially for weak sources that are properly
angularly separated (Cardoso and Moulines, 1995).

This latter point, jointly with a higher numerical complexity, is the main
weakness of FO and HO cumulant-based methods. One way to decrease the
numerical complexity is to implement HO-contracted methods (Cardoso and
Moulines, 1995). These methods exploit only a part of the HO information and
are intermediate between SO and HO methods, but at the expense of the number
of sources to be processed, which cannot be greater than the number of sensors
minus one. One way to decrease the sensitivity of HO methods to finite-sample
effect is to exploit both the SO and HO data statistics (Cardoso, 1994) in the DOA
estimation process. For example, one may blindly identify the steering vectors
of the sources with a well-known blind identification method that uses both SO
and FO data statistics (Cardoso and Souloumiac, 1993; Comon, 1994), and then
estimate the DOA of the sources from the estimated steering vectors (Chevalier
et al., 1996). Such methods are robust to both finite-sample effects and model
errors but at the expense of both robustness with respect to an unknown colored
Gaussian noise and the number of sources to be processed, which cannot be
greater than the number of sensors. Another way to overcome the drawbacks of
HO methods in the presence of a high dynamic range between sources is to add
some deflation schemes, as was proposed recently in Albera et al. (2008) and
Birot et al. (2007), especially for brain source localization.

Although most of the developed HR DOA estimation methods exploit the
FO cumulants of the data, some use HO statistics such as hybrid nonlinear
moments (Jacovitti and Scarano, 1994) or generalized HO cumulants (Scarano
and Jacovitti, 1996). Moreover, important insights into the mechanisms of
array-processing methods exploiting the cumulants of the data at an arbitrary
even order, $m = 2q$ ($q \ge 1$), were achieved recently (Chevalier et al., 2005). This was
accomplished through the introduction of the 2qth-order VA concept, for several
arrangements of the 2qth-order data statistics and for arrays with either identical
or diversely polarized sensors. This concept allows for showing the existence
of an optimal arrangement of the 2qth-order data statistics and the increasing
resolution and increasing processing capacity of 2qth-order array-processing
methods as q increases, while keeping their robustness to a colored Gaussian
noise.

To exploit these properties for DOA estimation, an extension of MUSIC
to 2qth order was introduced very recently in Chevalier et al. (2006, 2007) for
arrays with identical and diversely polarized sensors, respectively. This gave rise
to 2q-MUSIC and several PD-2q-MUSIC (Polarization Diversity 2q-MUSIC)
methods. One PD-2q-MUSIC method is nothing more than the 2qth-order
extension of Ferrara and Parks's method (Ferrara and Parks, 1983).

The increased processing capacity offered by the 2q-MUSIC algorithm as
q increases allows for minimizing the number of sensors and reception chains
for a given number of sources. Also, despite its higher variance, it was shown
in Chevalier et al. (2006) that, in the presence of multiple sources, 2q-MUSIC
also has a robustness against modeling errors that increases with q. This may
allow in particular some constraints about antenna calibration or receiver chain
equalization to be relaxed. Finally, a noncircular extension to the 2q-MUSIC
method was proposed in Liu et al. (2008) to further increase the performance of
the latter for BPSK sources.

The aim of this chapter is to present the philosophy, properties, implementations
(both parallel and sequential), and performance of HO HR DOA estimation
methods through description of the families of the 2q-MUSIC and
PD-2q-MUSIC algorithms, which, in contrast to HO extensions of the ESPRIT
algorithm, are well suited for any kind of array geometries. HO cumulants,
along with the observation model and its HO statistics, are presented in
Section 8.2. The 2q-MUSIC method for arrays with identical sensors, along with
its possible parallel and sequential implementations, is presented in Section 8.3.
Section 8.4 introduces the HO VA concept and the identifiability issue of the
2q-MUSIC method. Asymptotic performance, in terms of precision and resolution,
of 2q-MUSIC in the presence of modeling errors is computed analytically
and illustrated in Section 8.5. Some computer simulations illustrating the performance
of 2q-MUSIC methods from a finite number of snapshots are presented in
Section 8.6. Section 8.7 introduces the philosophy and performance of PD-2q-MUSIC
methods, which correspond to extensions of 2q-MUSIC to arrays with
diversely polarized antennas. Finally, Section 8.8 concludes the chapter.


8.2 OBSERVATION MODEL AND DATA STATISTICS 

The observation model is presented in Section 8.2.1, while SO and HO statistics
of the data are described in Section 8.2.2.

8.2.1 Observation Model 

We consider an array of N narrowband (NB) identical sensors, and we call $x(t)$
the vector of complex amplitudes of the signals at the sensors' output. Each
sensor is assumed to receive the contribution of P (P is potentially greater than
or equal to N) zero-mean stationary (to simplify the presentation but not required
in practice) NB sources, which may be statistically independent or not, corrupted
by a noise. We assume that the P sources can be divided into G groups, with
$P_g$ sources in group g, such that the sources in each group are assumed to be
statistically dependent but not coherent (i.e., not fully correlated), while sources
belonging to different groups are assumed to be statistically independent. In
particular, G = P corresponds to P statistically independent sources, whereas
G = 1 corresponds to the case where all the sources are dependent. Of course,
the $P_g$ parameters are such that

$P = \sum_{g=1}^{G} P_g$ (8.1)

Under these assumptions, the observation vector can approximately be written
as follows:

$x(t) = \sum_{i=1}^{P} m_i(t)\, a(\boldsymbol{\theta}_i, \beta_i) + v(t) = A\, m(t) + v(t) = \sum_{g=1}^{G} A_g m_g(t) + v(t) = \sum_{g=1}^{G} x_g(t) + v(t)$ (8.2)

In Equation (8.2), $v(t)$ is the noise vector, assumed zero mean and Gaussian;
$m(t)$, independent of $v(t)$, is the vector with components $m_i(t)$ that are the complex
amplitudes of the sources; $\boldsymbol{\theta}_i = (\theta_i, \varphi_i)$, where $\theta_i$ and $\varphi_i$ are the azimuth and
the elevation angles of source i (Figure 8.1); $\beta_i$ (characterized by two parameters
in the wave plane) (Compton, 1988) characterizes the state of polarization
of source i; $A$ is the $(N \times P)$ matrix of the source steering vectors $a(\boldsymbol{\theta}_i, \beta_i)$
$(1 \le i \le P)$, which contains in particular information about the DOA of the
sources; $A_g$ is the $(N \times P_g)$ submatrix of $A$ corresponding to the gth group
of sources; $m_g(t)$ is the corresponding $(P_g \times 1)$ subvector of $m(t)$ and $x_g(t) =
A_g m_g(t)$. In the absence of modeling errors, and of mutual coupling between the
sensors in particular, assuming a planewave propagation, component n of vector
$a(\boldsymbol{\theta}_i, \beta_i)$, denoted $a_n(\boldsymbol{\theta}_i, \beta_i)$, can be written as (Compton, 1988)

$a_n(\boldsymbol{\theta}_i, \beta_i) = f_n(\boldsymbol{\theta}_i, \beta_i) \exp\{j 2\pi [x_n \cos(\theta_i)\cos(\varphi_i) + y_n \sin(\theta_i)\cos(\varphi_i) + z_n \sin(\varphi_i)]/\lambda\}$ (8.3)

In Equation (8.3), $\lambda$ is the wavelength; $(x_n, y_n, z_n)$ are the coordinates of sensor
$n$ of the array; and $f_n(\boldsymbol{\theta}_i, \boldsymbol{\beta}_i)$ is a complex number corresponding to the response
of sensor $n$ to a unit electric field coming from direction $\boldsymbol{\theta}_i$ and having the state of
polarization $\boldsymbol{\beta}_i$ (Compton, 1988).

FIGURE 8.1 An incoming signal in three dimensions. (Source: Reprinted from Chevalier et al., 2005, © IEEE, with permission.)

For an array with identical sensors, $f_n(\boldsymbol{\theta}_i, \boldsymbol{\beta}_i) = f(\boldsymbol{\theta}_i, \boldsymbol{\beta}_i)$ is the same for all
the sensors $n$, and $a_n(\boldsymbol{\theta}_i, \boldsymbol{\beta}_i)$ and $a(\boldsymbol{\theta}_i, \boldsymbol{\beta}_i)$ can then be written as
$a_n(\boldsymbol{\theta}_i, \boldsymbol{\beta}_i) = f(\boldsymbol{\theta}_i, \boldsymbol{\beta}_i)\, a_n(\boldsymbol{\theta}_i)$ and $a(\boldsymbol{\theta}_i, \boldsymbol{\beta}_i) = f(\boldsymbol{\theta}_i, \boldsymbol{\beta}_i)\, a(\boldsymbol{\theta}_i)$, respectively,
where $a_n(\boldsymbol{\theta}_i)$ is the exponential term of the right side of Equation (8.3) and where
$a(\boldsymbol{\theta}_i)$ is the vector with components $a_n(\boldsymbol{\theta}_i)$. In such situations, the term $f(\boldsymbol{\theta}_i, \boldsymbol{\beta}_i)$
may be inserted in the complex envelope, $m_i(t)$, of source $i$ and the steering
vector of source $i$ may be reduced to $a(\boldsymbol{\theta}_i)$, which is done in the following up
to Section 8.6. In other words, for arrays with identical sensors, the observation
model is assumed to be given by Equation (8.2) with $a(\boldsymbol{\theta}_i)$ instead of $a(\boldsymbol{\theta}_i, \boldsymbol{\beta}_i)$.
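
As an illustration of Equation (8.3) for identical sensors, the following minimal Python sketch evaluates the steering vector with the sensor gain set to one. The function name, the three-sensor linear array, and the 60-degree azimuth are assumptions chosen only for the example; they do not come from the chapter.

```python
import cmath
import math

def steering_vector(sensors, azimuth, elevation, wavelength):
    """Plane-wave steering vector of Eq. (8.3) with unit sensor gain f_n = 1.

    sensors   : list of (x, y, z) coordinates, in the same unit as wavelength
    azimuth   : theta, in radians
    elevation : phi, in radians
    """
    # Direction cosines of the incoming plane wave
    u = math.cos(azimuth) * math.cos(elevation)
    v = math.sin(azimuth) * math.cos(elevation)
    w = math.sin(elevation)
    return [
        cmath.exp(2j * math.pi * (x * u + y * v + z * w) / wavelength)
        for (x, y, z) in sensors
    ]

# Hypothetical linear array on the x axis with half-wavelength spacing
wavelength = 1.0
sensors = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (1.0, 0.0, 0.0)]
a = steering_vector(sensors, math.radians(60.0), 0.0, wavelength)
```

With this geometry the phase advances by a quarter turn from one sensor to the next, so every component has unit modulus and only the phases carry the DOA information.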


8.2.2 Data Statistics 

SO and HO statistics of the data are described in the following subsections. 


Interest of Cumulants 

Let $u$ and $x$ be two $(N \times 1)$ real-valued vectors such that $x$ is random. The
$q$th-order cumulants of $x$, with components $x_i$ $(1 \le i \le N)$, are defined (Priestley,
1981) as the coefficients of the terms of order $q$ in the Taylor series expansion of
the second characteristic function, $\Psi(u) = \ln E\{\exp(j u^T x)\}$, of $x$. The $q$th-order
cumulants of $x$ are linked to its $p$th-order moments $(1 \le p \le q)$ by the so-called
Leonov-Shiryaev formula (McCullagh, 1987), given by


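\[
\mathrm{Cum}[x_{i_1}, x_{i_2}, \ldots, x_{i_q}] = \sum_{p=1}^{q} (-1)^{p-1} (p-1)!\;
E\Big\{\prod_{j \in S_1} x_{i_j}\Big\}\, E\Big\{\prod_{j \in S_2} x_{i_j}\Big\} \cdots E\Big\{\prod_{j \in S_p} x_{i_j}\Big\} \tag{8.4}
\]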


where $1 \le i_j \le N$ for $1 \le j \le q$, and where $(S_1, S_2, \ldots, S_p)$ describes all the
partitions in $p$ sets of $(1, 2, \ldots, q)$. In particular, for a zero-mean vector $x$, the
second-order, third-order, and FO cumulants of $x$ are given by


\[ \mathrm{Cum}[x_i, x_j] = E\{x_i x_j\} \tag{8.5a} \]

\[ \mathrm{Cum}[x_i, x_j, x_k] = E\{x_i x_j x_k\} \tag{8.5b} \]

\[
\mathrm{Cum}[x_i, x_j, x_k, x_l] = E\{x_i x_j x_k x_l\} - E\{x_i x_j\}E\{x_k x_l\}
- E\{x_i x_k\}E\{x_j x_l\} - E\{x_i x_l\}E\{x_j x_k\} \tag{8.5c}
\]


for $1 \le i, j, k, l \le N$. In the case of a nonzero-mean vector $x$, one has to replace
$x_i$ with $x_i - E\{x_i\}$ in these formulas. In the case of a complex vector $x$, one
has to replace $x_i$ by $x_i^{\varepsilon_i}$ in these formulas and in Equation (8.4), where
$\varepsilon_i = \pm 1$ with the convention $x^{1} = x$ and $x^{-1} = x^*$, the complex conjugate of
$x$. In this case, the $q$th-order cumulants of $x$ correspond to the set of quantities
$\mathrm{Cum}[x_{i_1}^{\varepsilon_{i_1}}, x_{i_2}^{\varepsilon_{i_2}}, \ldots, x_{i_q}^{\varepsilon_{i_q}}]$ such that $1 \le i_j \le N$ and $\varepsilon_{i_j} = \pm 1$ for $1 \le j \le q$.

Cumulants are frequently used in signal processing, and in array processing
in particular, since they have many important properties (Mendel, 1991):
[P1]. Cumulants are symmetric in their arguments:

\[ \mathrm{Cum}[x_1, \ldots, x_q] = \mathrm{Cum}[x_{i_1}, \ldots, x_{i_q}] \tag{8.6a} \]

where $(i_1, \ldots, i_q)$ is a permutation of $(1, \ldots, q)$.

[P2]. Cumulants are multilinear, that is, linear with respect to each of their
arguments:

\[ \mathrm{Cum}[x_1 + y_1, \ldots, x_q] = \mathrm{Cum}[x_1, \ldots, x_q] + \mathrm{Cum}[y_1, \ldots, x_q] \tag{8.6b} \]

\[ \mathrm{Cum}[\lambda x_1, \ldots, x_q] = \lambda\, \mathrm{Cum}[x_1, \ldots, x_q] \tag{8.6c} \]

where $y_1$ is a random variable and $\lambda$ is a constant.

[P3]. Cumulants are invariant by a constant translation:

\[ \mathrm{Cum}[x_1 + \lambda, \ldots, x_q] = \mathrm{Cum}[x_1, \ldots, x_q] \tag{8.6d} \]

where $\lambda$ is a constant.

[P4]. If collections $(x_1, \ldots, x_q)$ and $(y_1, \ldots, y_q)$ are statistically independent,
then

\[ \mathrm{Cum}[x_1 + y_1, \ldots, x_q + y_q] = \mathrm{Cum}[x_1, \ldots, x_q] + \mathrm{Cum}[y_1, \ldots, y_q] \tag{8.6e} \]

[P5]. If a subset of the collection $(x_1, \ldots, x_q)$ is independent of the rest, then

\[ \mathrm{Cum}[x_1, \ldots, x_q] = 0 \tag{8.6f} \]

[P6]. If $(x_1, \ldots, x_q)$ are jointly Gaussian, then

\[ \mathrm{Cum}[x_1, \ldots, x_q] = 0 \quad \text{for } q \ge 3 \tag{8.6g} \]
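
The Leonov-Shiryaev formula (8.4) can be checked numerically on a small data set. The sketch below is only an illustration under stated assumptions: it enumerates all set partitions explicitly, replaces every expectation by a sample average over a hypothetical four-sample data set, and uses helper names (`set_partitions`, `moment`, `cumulant`) invented for this example.

```python
from math import factorial

def set_partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # `first` either forms a block of its own ...
        yield [[first]] + partition
        # ... or joins one of the existing blocks
        for n, block in enumerate(partition):
            yield partition[:n] + [[first] + block] + partition[n + 1:]

def moment(samples, idx):
    """Sample average of the product of the components listed in idx."""
    total = 0.0
    for s in samples:
        prod = 1.0
        for j in idx:
            prod *= s[j]
        total += prod
    return total / len(samples)

def cumulant(samples, idx):
    """Joint cumulant of the components listed in idx, following Eq. (8.4)."""
    positions = list(range(len(idx)))
    value = 0.0
    for partition in set_partitions(positions):
        p = len(partition)
        term = (-1) ** (p - 1) * factorial(p - 1)
        for block in partition:
            term *= moment(samples, [idx[j] for j in block])
        value += term
    return value

# Hypothetical real-valued data: 4 samples of a 3-component vector,
# chosen so that every component has zero sample mean
samples = [(1.0, 2.0, -1.0), (0.5, -1.5, 2.0), (-2.0, 0.0, 1.0), (0.5, -0.5, -2.0)]
c4 = cumulant(samples, (0, 1, 2, 0))
```

Because the sample means are zero here, `c4` coincides with the right side of Equation (8.5c) evaluated with sample averages. Permuting the arguments or adding a constant to one component leaves the result unchanged, in line with [P1] and [P3].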


Second- and Higher-Order Statistics 

The $2q$th-order cumulants of the data, $\mathrm{Cum}[x_{i_1}(t), \ldots, x_{i_r}(t), x_{i_{r+1}}(t)^*, \ldots, x_{i_{2q}}(t)^*]$
$(1 \le i_j \le N)$ $(1 \le j \le 2q)$, such that $r \ne q$, are equal to zero for $2q$th-order
circular observations (Amblard et al., 1996; Picinbono, 1994). This is
particularly the case in the presence of both a Gaussian noise and $2q$th-order
circular sources such as $M$-PSK sources (Proakis, 1995) for $M > 2q$. For
this reason, the $2q$-MUSIC $(q \ge 1)$ method (Chevalier et al., 2006) exploits
the information contained in the $(N^q \times N^q)$ $2q$th-order circular covariance
matrix, $C_{2q,x}$, with entries that are the $2q$th-order circular cumulants of the
data, $\mathrm{Cum}[x_{i_1}(t), \ldots, x_{i_q}(t), x_{i_{q+1}}(t)^*, \ldots, x_{i_{2q}}(t)^*]$ $(1 \le i_j \le N)$ $(1 \le j \le 2q)$. Note
that the previous statistics are called circular since they are invariant by a
phase rotation of the components $x_{i_j}(t)$ (Amblard et al., 1996; Picinbono, 1994).

However, the previous entries, $\mathrm{Cum}[x_{i_1}(t), \ldots, x_{i_q}(t), x_{i_{q+1}}(t)^*, \ldots, x_{i_{2q}}(t)^*]$
$(1 \le i_j \le N)$ $(1 \le j \le 2q)$, can be arranged in the $C_{2q,x}$ matrix in different ways
that determine in particular both the resolution and the maximal processing power
of the $2q$-MUSIC method (Chevalier et al., 2005, 2006).


To show this result, let us introduce an arbitrary integer $l$ such that $0 \le l \le q$
and let us arrange the $2q$-uplet, $(i_1, \ldots, i_q, i_{q+1}, \ldots, i_{2q})$, of indices $i_j$ $(1 \le j \le 2q)$
into two $q$-uplets indexed by $l$ and defined by $(i_1, i_2, \ldots, i_l, i_{q+1}, \ldots, i_{2q-l})$ and
$(i_{2q-l+1}, \ldots, i_{2q}, i_{l+1}, \ldots, i_q)$, respectively. As the indices $i_j$ $(1 \le j \le 2q)$ vary
from 1 to $N$, the two latter $q$-uplets take $N^q$ values.

Numbering, in a natural way, the $N^q$ values of each of the two latter $q$-uplets
by the integers $I_l$ and $J_l$, respectively, such that $1 \le I_l, J_l \le N^q$, we obtain


\[
I_l = \sum_{j=1}^{l} N^{q-j}(i_j - 1) + \sum_{j=1}^{q-l} N^{q-l-j}(i_{q+j} - 1) + 1 \tag{8.7a}
\]

\[
J_l = \sum_{j=1}^{l} N^{q-j}(i_{2q-l+j} - 1) + \sum_{j=1}^{q-l} N^{q-l-j}(i_{l+j} - 1) + 1 \tag{8.7b}
\]
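
The numbering of Equations (8.7a) and (8.7b) can be sketched in a few lines of Python. The function name and the example index tuple are assumptions made for illustration; the sensor indices are 1-based as in the text.

```python
def arrangement_indices(i, n_sensors, q, l):
    """Row and column numbers (I_l, J_l) of Eqs. (8.7a) and (8.7b).

    i : tuple (i_1, ..., i_2q) of 1-based sensor indices
    """
    N = n_sensors

    def idx(j):
        return i[j - 1]  # 1-based access to i_j

    row, col = 1, 1
    for j in range(1, l + 1):
        row += N ** (q - j) * (idx(j) - 1)
        col += N ** (q - j) * (idx(2 * q - l + j) - 1)
    for j in range(1, q - l + 1):
        row += N ** (q - l - j) * (idx(q + j) - 1)
        col += N ** (q - l - j) * (idx(l + j) - 1)
    return row, col

# Hypothetical example: N = 3 sensors, (i_1, i_2, i_3, i_4) = (2, 3, 1, 2)
rc_l1 = arrangement_indices((2, 3, 1, 2), 3, q=2, l=1)
rc_l2 = arrangement_indices((2, 3, 1, 2), 3, q=2, l=2)
```

For $(q, l) = (2, 1)$ the function returns $N(i_1 - 1) + i_3$ and $N(i_4 - 1) + i_2$, and for $(q, l) = (2, 2)$ it returns $N(i_1 - 1) + i_2$ and $N(i_3 - 1) + i_4$.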


In particular, for $(q, l) = (2, 1)$, the integers $I_l$ and $J_l$ are given by
$I_1 = N(i_1 - 1) + i_3$ and $J_1 = N(i_4 - 1) + i_2$, respectively, whereas for $(q, l) = (2, 2)$
we obtain $I_2 = N(i_1 - 1) + i_2$ and $J_2 = N(i_3 - 1) + i_4$. Using the permutation
invariance property of the cumulants, we deduce that
$\mathrm{Cum}[x_{i_1}(t), \ldots, x_{i_q}(t), x_{i_{q+1}}(t)^*, \ldots, x_{i_{2q}}(t)^*] = \mathrm{Cum}[x_{i_1}(t), \ldots, x_{i_l}(t), x_{i_{q+1}}(t)^*, \ldots, x_{i_{2q-l}}(t)^*, x_{i_{2q-l+1}}(t)^*, \ldots, x_{i_{2q}}(t)^*, x_{i_{l+1}}(t), \ldots, x_{i_q}(t)]$.
Assuming that the latter quantity is the element $[I_l, J_l]$ of the $C_{2q,x}$ matrix,
thus noted $C_{2q,x}(l)$, it is straightforward to show, using Equation (8.2) and the
cumulant properties of Section 8.2.2, that the $(N^q \times N^q)$ $C_{2q,x}(l)$ matrix can be
written as


\[
C_{2q,x}(l) = \sum_{g=1}^{G} C_{2q,x_g}(l) + \eta_2 V(l)\, \delta(q - 1) \tag{8.8}
\]

In Equation (8.8), $\eta_2$ is the mean power of the noise per sensor, $V(l)$ is the
$(N \times N)$ spatial coherence matrix of the noise for the arrangement indexed by $l$,
such that $\mathrm{Tr}[V(l)] = N$; $\mathrm{Tr}[.]$ denotes Trace; $\delta(.)$ is the Kronecker symbol; and
the $(N^q \times N^q)$ matrix $C_{2q,x_g}(l)$ corresponds to the $2q$th-order circular cumulants
of $x_g(t)$ for the arrangement indexed by $l$, which can be written as


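\[
C_{2q,x_g}(l) = \big[A_g^{\otimes l} \otimes A_g^{* \otimes (q-l)}\big]\, C_{2q,m_g}(l)\, \big[A_g^{\otimes l} \otimes A_g^{* \otimes (q-l)}\big]^{\dagger} \tag{8.9}
\]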


In Equation (8.9), $C_{2q,m_g}(l)$ is the $(P_g^q \times P_g^q)$ matrix of the $2q$th-order circular
cumulants of $m_g(t)$ for the arrangement indexed by $l$; $\dagger$ corresponds to the
conjugate transposition; $\otimes$ is the Kronecker product; and $A_g^{\otimes l}$ is the $(N^l \times P_g^l)$
matrix defined by $A_g^{\otimes l} \triangleq A_g \otimes A_g \otimes \cdots \otimes A_g$, with a number of Kronecker
products equal to $l - 1$. Note that the independence of matrix $C_{2q,x_g}(l)$ with respect to
frequency in the reception band requires an HO NB assumption for the sources
(i.e., sources with a bandwidth that becomes smaller as $q$ increases; Chevalier
et al., 2005). In other words, for a given reception bandwidth and a given array
aperture, there exists an integer $q_0$ such that model Equation (8.9) is valid for
$q \le q_0$ and not valid for $q > q_0$. For HF or GSM links, for example, the HO NB
assumption is generally verified up to $q_0 = 8$ or 10 (i.e., up to a statistical order
$m = 2q$, which is equal to 16 or 20; Chevalier et al., 2005).
In particular, for $P$ statistically independent sources ($G = P$ and $P_g = 1$,
$1 \le g \le G$), Equation (8.8) takes the form

\[
C_{2q,x}(l) = \sum_{i=1}^{P} c_{2q,m_i}\, a_{q,l}(\boldsymbol{\theta}_i)\, a_{q,l}(\boldsymbol{\theta}_i)^{\dagger} + \eta_2 V(l)\, \delta(q - 1) \tag{8.10}
\]

where $c_{2q,m_i} \triangleq \mathrm{Cum}[m_{i_1}(t), \ldots, m_{i_q}(t), m_{i_{q+1}}(t)^*, \ldots, m_{i_{2q}}(t)^*]$, with $i_j = i$
$(1 \le j \le 2q)$, is the $2q$th-order circular auto-cumulant of $m_i(t)$; and where $a_{q,l}(\boldsymbol{\theta})$
is the $2q$th-order extended steering vector for the DOA $\boldsymbol{\theta}$ and the arrangement $l$,
defined by

\[
a_{q,l}(\boldsymbol{\theta}) \triangleq a(\boldsymbol{\theta})^{\otimes l} \otimes a(\boldsymbol{\theta})^{* \otimes (q-l)} \tag{8.11}
\]
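
The extended steering vector of Equation (8.11) is a chain of Kronecker products. A minimal Python sketch follows, with vectors stored as plain lists; the helper names `kron` and `extended_steering_vector` and the two-component test vector are assumptions for illustration, and $q \ge 1$ is assumed.

```python
from functools import reduce

def kron(u, v):
    """Kronecker product of two vectors given as lists."""
    return [ui * vj for ui in u for vj in v]

def extended_steering_vector(a, q, l):
    """l copies of a followed by (q - l) copies of its conjugate, Eq. (8.11)."""
    conj = [z.conjugate() for z in a]
    factors = [a] * l + [conj] * (q - l)
    return reduce(kron, factors)

a = [1 + 0j, 1j]                            # hypothetical two-sensor steering vector
a_21 = extended_steering_vector(a, 2, 1)    # a (x) a*
a_22 = extended_steering_vector(a, 2, 2)    # a (x) a
```

The result always has $N^q$ components, and only the number of conjugated factors changes with the arrangement index $l$.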

Equation (8.10) shows that in the absence of noise ($\eta_2 = 0$) and in the presence
of $P$ statistically independent non-Gaussian sources ($c_{2q,m_i} \ne 0$, $1 \le i \le P$),
the rank of $C_{2q,x}(l)$ is equal to the number of sources $P$ as long as the vectors
$a_{q,l}(\boldsymbol{\theta}_i)$ $(1 \le i \le P)$ remain linearly independent. However, in the presence
of $2q$th-order correlated sources, $P_g > 1$ for at least one group $g$ and the rank of
the associated matrix $C_{2q,x_g}(l)$ is strictly greater than 2. This shows that, contrary
to SO source correlation matrices, the rank of $C_{2q,x}(l)$ for $q > 1$ is, in the
absence of noise, dependent on the correlation of the sources. To illustrate the
algebraic structure of $C_{2q,x}(l)$ in the presence of $2q$th-order correlated sources,
let us consider the cases $q = 1$ and $q = 2$.

For $q = 1$ and $l = 1$, the $(N \times N)$ $C_{2q,x}(l)$ matrix corresponds to the
well-known data covariance matrix (since the observations are zero mean)
defined by


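\[
R_x \triangleq C_{2,x}(1) = E\{x(t)\, x(t)^{\dagger}\} = \sum_{g=1}^{G} A_g\, C_{2,m_g}(1)\, A_g^{\dagger} + \eta_2 V(1)
\]
\[
= \sum_{g=1}^{G} \sum_{i,j=1}^{P_g} C_{2,m_g}(1)[i,j]\; a_{gi}\, a_{gj}^{\dagger} + \eta_2 V(1) \triangleq R_s + \eta_2 V(1) \tag{8.12}
\]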


where $C_{2,m_g}(1)[i,j] = \mathrm{Cum}[m_i(t), m_j(t)^*]$ is the coefficient $[i,j]$ of the matrix
$C_{2,m_g}(1)$; $a_{gi}$ is the steering vector of source $i$ of group $g$; and $R_s$ is the
correlation matrix of the mixed sources only. It is obvious that in the absence of
noise, the rank of $R_x$ is equal to $P$, whatever the correlation of the sources,
provided that the sources are not coherent (i.e., not fully correlated) and the
vectors $a_{gi}$ $(1 \le i \le P)$ are linearly independent.

For $q = 2$ and $l = 1$, the $(N^2 \times N^2)$ $C_{2q,x}(l)$ matrix corresponds to the classical
expression of the data quadricovariance matrix, used in most papers dealing with
FO DOA estimation problems, defined by

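\[
Q_x \triangleq C_{4,x}(1) = \sum_{g=1}^{G} [A_g \otimes A_g^*]\, C_{4,m_g}(1)\, [A_g \otimes A_g^*]^{\dagger}
\]
\[
= \sum_{g=1}^{G} \sum_{i,j,k,l=1}^{P_g} C_{4,m_g}(1)[i,j,k,l]\; [a_{gi} \otimes a_{gj}^*]\, [a_{gk} \otimes a_{gl}^*]^{\dagger} \tag{8.13}
\]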

where $C_{4,m_g}(1)[i,j,k,l] = \mathrm{Cum}[m_i(t), m_j(t)^*, m_k(t)^*, m_l(t)]$ is the coefficient
$[i,j,k,l]$ of the matrix $C_{4,m_g}(1)$. It is obvious that the rank of $Q_x$ becomes
greater than $P$ in the presence of fourth-order correlated sources and for vectors
$[a_{gi} \otimes a_{gj}^*]$ $(1 \le i, j \le P_g)$, which are linearly independent.

For $q = 2$ and $l = 2$, the $(N^2 \times N^2)$ $C_{2q,x}(l)$ matrix corresponds to an
alternative expression of the data quadricovariance matrix (not used often)
defined by

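\[
Q_x \triangleq C_{4,x}(2) = \sum_{g=1}^{G} [A_g \otimes A_g]\, C_{4,m_g}(2)\, [A_g \otimes A_g]^{\dagger}
\]
\[
= \sum_{g=1}^{G} \sum_{i,j,k,l=1}^{P_g} C_{4,m_g}(2)[i,j,k,l]\; [a_{gi} \otimes a_{gj}]\, [a_{gk} \otimes a_{gl}]^{\dagger} \tag{8.14}
\]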

where $C_{4,m_g}(2)[i,j,k,l] = \mathrm{Cum}[m_i(t), m_j(t), m_k(t)^*, m_l(t)^*]$ is the coefficient
$[i,j,k,l]$ of the matrix $C_{4,m_g}(2)$. Again, it is obvious that the rank of $Q_x$ becomes
greater than $P$ in the presence of fourth-order correlated sources and for vectors
$[a_{gi} \otimes a_{gj}]$ $(1 \le i, j \le P_g)$, which are linearly independent.

Statistics Estimation 

In situations of practical interest, the $2q$th-order statistics of the data,
$\mathrm{Cum}[x_{i_1}(t), \ldots, x_{i_q}(t), x_{i_{q+1}}(t)^*, \ldots, x_{i_{2q}}(t)^*]$, are not known a priori and have
to be estimated from $L$ samples of data, $x(k) \triangleq x(kT_e)$, $1 \le k \le L$, where $T_e$ is the
sample period.

For zero-mean stationary observations, using the ergodicity property, an
empirical estimator of $\mathrm{Cum}[x_{i_1}(t), \ldots, x_{i_q}(t), x_{i_{q+1}}(t)^*, \ldots, x_{i_{2q}}(t)^*]$, asymptotically
unbiased and consistent, may be built from the Leonov-Shiryaev formula
(8.4) by replacing in the latter all the moments by their empirical estimate.
More precisely, an empirical estimate of Equation (8.4) with $x_{i_j}(t)^{\varepsilon_j}$
instead of $x_{i_j}$ is obtained by replacing in Equation (8.4) all the moments
$E\{x_{i_1}(t)^{\varepsilon_1} x_{i_2}(t)^{\varepsilon_2} \cdots x_{i_p}(t)^{\varepsilon_p}\}$ $(1 \le p \le n)$ by their empirical estimate, given by

\[
\hat{E}\{x_{i_1}(t)^{\varepsilon_1} x_{i_2}(t)^{\varepsilon_2} \cdots x_{i_p}(t)^{\varepsilon_p}\}(L) \triangleq \frac{1}{L} \sum_{k=1}^{L} x_{i_1}(k)^{\varepsilon_1} x_{i_2}(k)^{\varepsilon_2} \cdots x_{i_p}(k)^{\varepsilon_p} \tag{8.15}
\]
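
For $q = 2$, combining Equation (8.5c) with the sample averages of Equation (8.15) gives a simple empirical estimate of the FO circular cumulant $\mathrm{Cum}[x_i, x_j, x_k^*, x_l^*]$ for zero-mean stationary data. The sketch below is one possible reading of that recipe, not the estimator of any cited reference; the function names, the 0-based sensor indices, and the QPSK-like toy sequence are assumptions.

```python
def mean(values):
    return sum(values) / len(values)

def fo_circular_cumulant(x, i, j, k, l):
    """Empirical Cum[x_i, x_j, x_k*, x_l*] for zero-mean snapshots.

    x : list of snapshots, each a list of N complex samples (0-based indices).
    Each moment of Eq. (8.5c) is replaced by a sample average, Eq. (8.15).
    """
    xi = [s[i] for s in x]
    xj = [s[j] for s in x]
    xk = [s[k].conjugate() for s in x]
    xl = [s[l].conjugate() for s in x]
    m4 = mean([a * b * c * d for a, b, c, d in zip(xi, xj, xk, xl)])
    m_ij = mean([a * b for a, b in zip(xi, xj)])
    m_kl = mean([c * d for c, d in zip(xk, xl)])
    m_ik = mean([a * c for a, c in zip(xi, xk)])
    m_jl = mean([b * d for b, d in zip(xj, xl)])
    m_il = mean([a * d for a, d in zip(xi, xl)])
    m_jk = mean([b * c for b, c in zip(xj, xk)])
    return m4 - m_ij * m_kl - m_ik * m_jl - m_il * m_jk

# One sensor observing a unit-modulus four-state sequence (toy data)
snapshots = [[1 + 0j], [1j], [-1 + 0j], [-1j]] * 4
c4 = fo_circular_cumulant(snapshots, 0, 0, 0, 0)
```

For this unit-power four-state sequence the estimate equals $-1$, which is nonzero as required for a non-Gaussian source.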









Explicit expressions of Equation (8.4) for $n = 2q$ with $1 \le q \le 3$ are given in
Chevalier et al. (2005).


However, in a radio communications context, most of the sources are
no longer stationary but become cyclostationary (digital modulations). For
zero-mean cyclostationary observations, the statistical matrix defined by
Equation (8.8) becomes time dependent, noted $C_{2q,x}(l)(t)$, and the theory developed
in this chapter can be extended without any difficulties by considering
that $C_{2q,x}(l)$ is, in this case, the temporal mean, $\langle C_{2q,x}(l)(t) \rangle$, over an infinite
interval duration, of the instantaneous statistics, $C_{2q,x}(l)(t)$. In these conditions,
using a cyclo-ergodicity property, the matrix $C_{2q,x}(l)$ has to be estimated from the
sampled data by a nonempirical estimator such as that presented in Ferreol and
Chevalier (2000) for $q = 2$, the convergence of which is shown in Dandawate
and Giannakis (1995). Note finally that this extension can also be applied to
nonzero-mean cyclostationary sources, such as some nonlinearly digitally
modulated sources (Ferreol et al., 2004b), provided that a nonempirical statistics
estimator, such as that presented in Ferreol et al. (2004b) for $q = 1$ and in Ferreol
et al. (2002) for $q = 2$, is used.


8.3 THE 2q-MUSIC METHOD 

The $2q$-MUSIC method is described in this section. To do so, we first introduce
some hypotheses in Section 8.3.1 and analyze the algebraic structure of the
$C_{2q,x}(l)$ matrix in Section 8.3.2. The $2q$-MUSIC method is then presented in
Section 8.3.3. Finally, Section 8.3.4 is devoted to the possible implementations
of the $2q$-MUSIC method.


8.3.1 Hypotheses 

To develop the $2q$-MUSIC algorithm for the arrangement $l$ (Chevalier et al.,
2006), we need some hypotheses that correspond to the following:

H1. $P_g < N$, $1 \le g \le G$.

H2. Matrix $A_g^{\otimes l} \otimes A_g^{* \otimes (q-l)}$ has full-rank $P_g^q$, $1 \le g \le G$.

H3. $P(G, q) \triangleq \sum_{g=1}^{G} P_g^q < N^q$.

H4. Matrix $A_{q,l} \triangleq [A_1^{\otimes l} \otimes A_1^{* \otimes (q-l)}, \ldots, A_G^{\otimes l} \otimes A_G^{* \otimes (q-l)}]$ has full-rank $P(G, q)$.

For example, for $(q, l) = (2, 1)$, hypothesis H2 reduces to the requirement that
$A_g \otimes A_g^*$ has full-rank $P_g^2$. In particular, for sources that are all statistically
dependent ($G = 1$), $P(G, q) = P^q$, matrix $A_1$ reduces to $A$ and hypotheses H1
through H4 reduce to

H1'. $P < N$.

H2'. Matrix $A^{\otimes l} \otimes A^{* \otimes (q-l)}$ has full-rank $P^q$.












































Chapter 8 High-Resolution DOA Estimation with Higher-Order Statistics


However, for statistically independent sources ($G = P$), $P(G, q) = P$, matrix $A_g$
reduces to the vector $a(\boldsymbol{\theta}_g)$ and hypotheses H1 through H4 reduce to

H1''. $P < N^q$.

H2''. Matrix $A_{q,l} \triangleq [a_{q,l}(\boldsymbol{\theta}_1), \ldots, a_{q,l}(\boldsymbol{\theta}_P)]$ has full-rank $P$.

8.3.2 Properties of $C_{2q,x}(l)$

Although components of $m_g(t)$ are statistically dependent, the $(P_g^q \times P_g^q)$ matrix
$C_{2q,m_g}(l)$, which contains the $2q$th-order cumulants of $m_g(t)$ for the arrangement
indexed by $l$, may not be full rank for some couples $(q, l)$. Indeed, assuming for
example that $(q, l) = (2, 2)$, it is easy to verify that the maximal rank of $C_{4,m_g}(2)$
is 3 (not 4) for $P_g = 2$ and 6 (not 9) for $P_g = 3$. In this context, noting $r_{2q,m_g}(l)$ as
the rank of $C_{2q,m_g}(l)$ ($r_{2q,m_g}(l) \le P_g^q$), we deduce from H1 and H2 that matrix
$C_{2q,x_g}(l)$ for $q > 1$ also has rank $r_{2q,m_g}(l)$. Thus, using H4 and for $q > 1$, matrix
$C_{2q,x}(l)$ has a rank $r_{2q,x}(l)$ equal to

\[
r_{2q,x}(l) = \sum_{g=1}^{G} r_{2q,m_g}(l) \tag{8.16}
\]

and such that $r_{2q,x}(l) < N^q$ from H3. As matrix $C_{2q,x}(l)$ is Hermitian, we deduce
that $C_{2q,x}(l)$ has $r_{2q,x}(l)$ nonzero real-valued eigenvalues and $N^q - r_{2q,x}(l)$ zero
eigenvalues for $q > 1$.

8.3.3 Construction of 2q-MUSIC 

To build a MUSIC-like method from the matrix $C_{2q,x}(l)$, for $q > 1$, we first
compute the eigendecomposition of the latter, given by

\[
C_{2q,x}(l) = U_{2q,s}(l)\, \Lambda_{2q,s}(l)\, U_{2q,s}(l)^{\dagger} + U_{2q,n}(l)\, \Lambda_{2q,n}(l)\, U_{2q,n}(l)^{\dagger} \tag{8.17}
\]

where $\Lambda_{2q,s}(l)$ is the $(r_{2q,x}(l) \times r_{2q,x}(l))$ diagonal matrix of the nonzero
eigenvalues of $C_{2q,x}(l)$; $U_{2q,s}(l)$ is the $(N^q \times r_{2q,x}(l))$ unitary matrix of the associated
eigenvectors; $\Lambda_{2q,n}(l)$ is the $((N^q - r_{2q,x}(l)) \times (N^q - r_{2q,x}(l)))$ diagonal matrix
of the zero eigenvalues of $C_{2q,x}(l)$; and $U_{2q,n}(l)$ is the $(N^q \times (N^q - r_{2q,x}(l)))$
unitary matrix of the associated eigenvectors. As $C_{2q,x}(l)$ is Hermitian, all
columns of $U_{2q,s}(l)$ are orthogonal to all columns of $U_{2q,n}(l)$. Moreover,
$\mathrm{Span}\{U_{2q,s}(l)\} = \mathrm{Span}\{A_{q,l}\}$ when matrices $C_{2q,m_g}(l)$, $1 \le g \le G$, are full rank;
$\mathrm{Span}\{U_{2q,s}(l)\} \subset \mathrm{Span}\{A_{q,l}\}$, otherwise. Denoting by $\boldsymbol{\theta}_{ig}$ the DOA parameters
of the $i$th source in the $g$th group, it can be easily verified that, in all cases, the
vector $a_{q,l}(\boldsymbol{\theta}_{ig})$ always belongs to $\mathrm{Span}\{U_{2q,s}(l)\}$, called the $2q$th-order signal
subspace for the arrangement $l$.

Consequently, all vectors $\{a_{q,l}(\boldsymbol{\theta}_{ig}),\ 1 \le i \le P_g,\ 1 \le g \le G\}$ are orthogonal to
the columns of $U_{2q,n}(l)$ and are solutions to

\[
a_{q,l}(\boldsymbol{\theta})^{\dagger}\, \Pi_{2q,n}(l)\, a_{q,l}(\boldsymbol{\theta}) = 0 \tag{8.18}
\]

where $\Pi_{2q,n}(l) \triangleq U_{2q,n}(l)\, U_{2q,n}(l)^{\dagger}$.




Equation (8.18) corresponds to the heart of the $2q$-MUSIC algorithm for the
arrangement $l$ (Chevalier et al., 2006), which corresponds to the well-known
2-MUSIC pseudo-spectrum for $(q, l) = (1, 1)$. Because, for an array with identical
sensors in the absence of mutual coupling between the sensors, the complex
gain of the sensors, $f(\boldsymbol{\theta}, \boldsymbol{\beta})$, has been integrated in the complex envelope of the
associated source, we deduce from Equations (8.3) and (8.11) that $a_{q,l}(\boldsymbol{\theta})^{\dagger} a_{q,l}(\boldsymbol{\theta})$
does not depend on $\boldsymbol{\theta}$. However, in the presence of mutual coupling between sensors,
a situation for which the array manifold $a(\boldsymbol{\theta})$ has to be estimated by array
calibration, $a_{q,l}(\boldsymbol{\theta})^{\dagger} a_{q,l}(\boldsymbol{\theta})$ may be a function of $\boldsymbol{\theta}$. In this case, in the absence of
sources (i.e., when $\Pi_{2q,n}(l) = I_{N^q}$), to ensure a constant value, independent of
parameters $\boldsymbol{\theta}$, of the left side of Equation (8.18), it is necessary to normalize the
latter by the quantity $a_{q,l}(\boldsymbol{\theta})^{\dagger} a_{q,l}(\boldsymbol{\theta})$. Moreover, in practical situations, matrix
$U_{2q,s}(l)$ has to be estimated from the observations, and the DOAs of the sources
may be found by searching for the minima of the left side of Equation (8.18).

The steps of the $2q$-MUSIC algorithm for the arrangement $l$ are summarized
as follows:

Step 1. Estimation, $\hat{C}_{2q,x}(l)$, of the matrix $C_{2q,x}(l)$ from $L$ snapshots $x(k)$,
$1 \le k \le L$, using a suitable estimator of the $2q$th-order cumulants of observations.

Step 2. Eigendecomposition of the matrix $\hat{C}_{2q,x}(l)$ and extraction of an estimate,
$\hat{U}_{2q,n}(l)$, of the $U_{2q,n}(l)$ matrix. This step may involve rank determination
in cases where the number of sources and/or their mutual statistical dependence
are not known a priori.

Step 3. Computation of the estimated pseudo-spectrum

\[
\hat{P}_{2q\text{-Music}(l)}(\boldsymbol{\theta}) \triangleq \frac{a_{q,l}(\boldsymbol{\theta})^{\dagger}\, \hat{\Pi}_{2q,n}(l)\, a_{q,l}(\boldsymbol{\theta})}{a_{q,l}(\boldsymbol{\theta})^{\dagger}\, a_{q,l}(\boldsymbol{\theta})} \tag{8.19}
\]

over a suitably chosen grid, and search for the local minima (including
interpolation at each local minimum), where $\hat{\Pi}_{2q,n}(l) \triangleq \hat{U}_{2q,n}(l)\, \hat{U}_{2q,n}(l)^{\dagger}$.
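
Step 3 can be illustrated on a deliberately simple case. The sketch below is not the full algorithm: it assumes a single noise-free source, so that the noise projector is known in closed form as $I - s s^{\dagger}/(s^{\dagger} s)$ with $s = a_{2,1}(\theta_0)$, and no eigendecomposition is needed. The three-sensor half-wavelength linear array, the zero elevation, the 70-degree DOA, the one-degree grid, and all function names are assumptions made for the example.

```python
import cmath
import math

def ula_steering(n_sensors, theta):
    """Half-wavelength linear array on the x axis, zero elevation (Eq. 8.3, f_n = 1)."""
    return [cmath.exp(1j * math.pi * n * math.cos(theta)) for n in range(n_sensors)]

def a_21(n_sensors, theta):
    """Extended steering vector a (x) a* of Eq. (8.11) for (q, l) = (2, 1)."""
    a = ula_steering(n_sensors, theta)
    return [u * v.conjugate() for u in a for v in a]

def pseudo_spectrum(noise_projector, b):
    """Normalised quadratic form of Eq. (8.19)."""
    size = len(b)
    num = sum(b[r].conjugate() * noise_projector[r][c] * b[c]
              for r in range(size) for c in range(size))
    den = sum(abs(z) ** 2 for z in b)
    return (num / den).real

N, theta_true = 3, math.radians(70.0)
s = a_21(N, theta_true)
norm2 = sum(abs(z) ** 2 for z in s)
# Rank-one, noise-free case: the noise projector is I - s s^H / (s^H s)
Pi_n = [[(1.0 if r == c else 0.0) - s[r] * s[c].conjugate() / norm2
         for c in range(len(s))] for r in range(len(s))]

grid = [math.radians(d) for d in range(1, 180)]   # 1-degree azimuth grid
values = [pseudo_spectrum(Pi_n, a_21(N, th)) for th in grid]
theta_hat = grid[values.index(min(values))]
```

In this idealized setting the pseudo-spectrum vanishes, up to rounding, at the true azimuth. With estimated statistics the projector would instead come from Step 2, and the minima would only be approximate.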


Sometimes the number of sources $P$ is known, such that $P < N$, but their
statistical dependence is not. In this case, $r_{2q,x}(l) \le P^q$, and a conservative
approach is to use only the $(N^q - P^q)$ eigenvectors associated with the smallest
eigenvalues to build $\hat{U}_{2q,n}(l)$, which implicitly assumes the statistical dependence
of all sources. Note finally that similar to 2-MUSIC, $2q$-MUSIC cannot
handle perfectly coherent sources. Indeed, in such a case the associated vectors
$a_{q,l}(\boldsymbol{\theta}_{ig})$ no longer belong to $\mathrm{Span}\{U_{2q,s}(l)\}$ and the corresponding sources
become indistinguishable to the algorithm.


8.3.4 Implementations 

We note by $\hat{P}$ the estimate of the number of sources $P$ obtained from step 2 of
the $2q$-MUSIC algorithm. In the standard version of $2q$-MUSIC, the $\hat{P}$ local
minima of Equation (8.19) are estimated in parallel. However, this parallel
search gives rise to interaction between the sources when the spatial correlation
coefficient between at least two of the sources is high, which occurs in
particular when the sources are close to each other or when $P$ approaches the
number of sensors $N$. This interaction between sources generally decreases DOA
estimation performance, especially the performance of the weakest sources. To
decrease interaction between the sources in the pseudo-spectrum (8.19), and then
to increase the quality of their DOA estimation, one may replace the parallel
search of local minima in Equation (8.19) with a sequential search.

A sequential search consists of finding the DOA number $p$ after having
removed in the pseudo-spectrum (8.19), by some deflation scheme, the contribution
of DOA numbers 1 to $p - 1$ already estimated. Such sequential
schemes were proposed for implementation of 2-MUSIC in Mosher and Leahy
(1999), Oh and Un (1993), and Stoica et al. (1995), and have given rise
to S-MUSIC (Sequential MUSIC; Oh and Un, 1993), IES-MUSIC (Improved
Sequential MUSIC; Stoica et al., 1995), and RAP-MUSIC (Recursively Applied
and Projected MUSIC; Mosher and Leahy, 1999). More recently,
sequential versions of the $2q$-MUSIC algorithm were proposed in Albera et al.
(2008) and Birot et al. (2007), giving rise to the $2q$-RAP-MUSIC and $2q$-D-MUSIC
($2q$ Deflated MUSIC) algorithms. We limit the following discussion to
$2q$-D-MUSIC.

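Let us assume, for $p \ge 2$, that $p - 1$ DOAs, $\boldsymbol{\theta}_i$ for $1 \le i \le p - 1$, have already
been found, and let us denote by $A_{q,l}(\boldsymbol{\theta}_{1:p-1})$ the matrix of the associated vectors
$a_{q,l}(\boldsymbol{\theta}_i)$, defined by $A_{q,l}(\boldsymbol{\theta}_{1:p-1}) \triangleq [a_{q,l}(\boldsymbol{\theta}_1), \ldots, a_{q,l}(\boldsymbol{\theta}_{p-1})]$. We then define, for
$1 < p \le P$, the orthogonal projector, $P_{q,l}(\boldsymbol{\theta}_{1:p-1})^{\perp}$, on the space that is orthogonal
to $\mathrm{Span}\{A_{q,l}(\boldsymbol{\theta}_{1:p-1})\}$ by

\[
P_{q,l}(\boldsymbol{\theta}_{1:p-1})^{\perp} \triangleq I_{N^q} - A_{q,l}(\boldsymbol{\theta}_{1:p-1}) \big[ A_{q,l}(\boldsymbol{\theta}_{1:p-1})^{\dagger} A_{q,l}(\boldsymbol{\theta}_{1:p-1}) \big]^{-1} A_{q,l}(\boldsymbol{\theta}_{1:p-1})^{\dagger} \tag{8.20}
\]

For $p = 1$, we assume that $P_{q,l}(\boldsymbol{\theta}_{1:0})^{\perp} = I_{N^q}$. For $p > 1$, it is straightforward
to verify that the projected vectors $P_{q,l}(\boldsymbol{\theta}_{1:p-1})^{\perp} a_{q,l}(\boldsymbol{\theta}_i)$ contained in the signal
subspace of the matrix $P_{q,l}(\boldsymbol{\theta}_{1:p-1})^{\perp}\, C_{2q,x}(l)\, P_{q,l}(\boldsymbol{\theta}_{1:p-1})^{\perp}$ correspond to
the projected vectors associated with the $P - p + 1$ DOAs that have yet to be
found, while the others have been removed by the projection operation. As a
consequence, the implementation of the $2q$-MUSIC algorithm from the matrix
$P_{q,l}(\boldsymbol{\theta}_{1:p-1})^{\perp}\, C_{2q,x}(l)\, P_{q,l}(\boldsymbol{\theta}_{1:p-1})^{\perp}$ and the $2q$th-order projected array manifold
$P_{q,l}(\boldsymbol{\theta}_{1:p-1})^{\perp} a_{q,l}(\boldsymbol{\theta})$ allows us to find the DOA number $p$ with a limited
interaction with sources 1 to $p - 1$.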
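
The projector of Equation (8.20) can be built without an explicit matrix inverse. The sketch below, given only as an illustration, orthonormalizes the already-found vectors by Gram-Schmidt and subtracts their outer products from the identity; this equals $I - A(A^{\dagger}A)^{-1}A^{\dagger}$ when the columns of $A$ are linearly independent. The function names and the two test vectors are assumptions.

```python
def inner(u, v):
    """Hermitian inner product u^H v."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

def orthogonal_projector(columns):
    """Projector onto the orthogonal complement of span(columns), Eq. (8.20).

    Assumes the columns are linearly independent.
    """
    n = len(columns[0])
    basis = []
    for col in columns:
        w = list(col)
        for u in basis:  # remove the components along earlier basis vectors
            coeff = inner(u, w)
            w = [wi - coeff * ui for wi, ui in zip(w, u)]
        norm = abs(inner(w, w)) ** 0.5
        basis.append([wi / norm for wi in w])
    return [[(1.0 if r == c else 0.0) - sum(u[r] * u[c].conjugate() for u in basis)
             for c in range(n)] for r in range(n)]

def apply(matrix, vector):
    """Matrix-vector product for nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

# Hypothetical already-estimated extended steering vectors (two columns of length 4)
columns = [[1, 1j, 0, 2], [0, 1, 1, -1j]]
P_perp = orthogonal_projector(columns)
```

Applying `P_perp` to either column returns a vector that is zero up to rounding, and applying it twice gives the same result as applying it once. These are the two properties used in the deflation argument above.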

The steps of the $2q$-D-MUSIC algorithm for the estimation of $\boldsymbol{\theta}_p$ $(1 \le p \le \hat{P})$
and for arrangement $l$ are summarized here from the estimation, $\hat{\boldsymbol{\theta}}_i$, of the DOAs
$\boldsymbol{\theta}_i$ $(1 \le i \le p - 1)$:

Step 1. Estimation, $\hat{C}_{2q,x}(l)$, of the matrix $C_{2q,x}(l)$ from $L$ snapshots, $x(k)$,
$1 \le k \le L$, using a suitable estimator of the $2q$th-order cumulants of observations.

Step 2. Initialization of parameters: $p = 1$ and $P_{q,l}(\hat{\boldsymbol{\theta}}_{1:0})^{\perp} = I_{N^q}$.




















































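Step 3. Eigendecomposition of the matrix $P_{q,l}(\hat{\boldsymbol{\theta}}_{1:p-1})^{\perp}\, \hat{C}_{2q,x}(l)\, P_{q,l}(\hat{\boldsymbol{\theta}}_{1:p-1})^{\perp}$
and extraction of an estimate, $\hat{U}_{2q,n,p-1}(l)$, of the unitary matrix of
the noise subspace eigenvectors. This step may involve rank determination in
cases where the number of sources and/or their mutual statistical dependence
are not known a priori.

Step 4. Computation of the estimated pseudo-spectrum

\[
\hat{P}_{2q\text{-D-Music},p}(l)(\boldsymbol{\theta}) \triangleq \frac{a_{q,l}(\boldsymbol{\theta})^{\dagger}\, P_{q,l}(\hat{\boldsymbol{\theta}}_{1:p-1})^{\perp}\, \hat{\Pi}_{2q,n,p-1}(l)\, P_{q,l}(\hat{\boldsymbol{\theta}}_{1:p-1})^{\perp}\, a_{q,l}(\boldsymbol{\theta})}{a_{q,l}(\boldsymbol{\theta})^{\dagger}\, P_{q,l}(\hat{\boldsymbol{\theta}}_{1:p-1})^{\perp}\, a_{q,l}(\boldsymbol{\theta})} \tag{8.21}
\]

over a suitably chosen grid, and search for the absolute minimum (including
interpolation around this minimum), where $\hat{\Pi}_{2q,n,p-1}(l) \triangleq \hat{U}_{2q,n,p-1}(l)\, \hat{U}_{2q,n,p-1}(l)^{\dagger}$.
The position of this minimum corresponds to $\hat{\boldsymbol{\theta}}_p$.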


Step 5. If $p < \hat{P}$, incrementation of $p$: $p = p + 1$.

Step 6. Computation of $A_{q,l}(\hat{\boldsymbol{\theta}}_{1:p-1}) = [a_{q,l}(\hat{\boldsymbol{\theta}}_1), \ldots, a_{q,l}(\hat{\boldsymbol{\theta}}_{p-1})]$ and the
orthogonal projector, $P_{q,l}(\hat{\boldsymbol{\theta}}_{1:p-1})^{\perp}$, defined by Equation (8.20) with $\hat{\boldsymbol{\theta}}_{1:p-1}$
instead of $\boldsymbol{\theta}_{1:p-1}$. To reduce the computational cost of $P_{q,l}(\hat{\boldsymbol{\theta}}_{1:p-1})^{\perp}$, a recursive
computation of the latter as a function of $P_{q,l}(\hat{\boldsymbol{\theta}}_{1:p-2})^{\perp}$ is described in Birot
et al. (2007) and may be preferred.

Step 7. Reiteration from step 3.

Let us note that in practical situations, the $p - 1$ source DOA estimates $\hat{\boldsymbol{\theta}}_i$,
$1 \le i \le p - 1$, have errors and that a contribution of these sources remains present
after the orthogonal projection (8.20). However, despite these errors, simulations
show that when resolution is required to separate sources, $2q$-D-MUSIC still
performs better than $2q$-MUSIC. To our knowledge, no analytical proof of this
result is currently available.


8.4 2q-MUSIC IDENTIFIABILITY 

Because the maximal number of sources that can be processed by the $2q$-MUSIC
algorithm is obtained when all sources are statistically independent, we limit the
identifiability analysis (i.e., the analysis of the maximal number of sources that
can be processed by the $2q$-MUSIC algorithm) to the case of statistically independent
sources. In such a situation, hypotheses H1 through H4 reduce to H1''
and H2''. For this reason, we first describe, in Section 8.4.1, the HO
virtual array concept (Chevalier et al., 2005), valid for statistically independent
sources, which generalizes to an arbitrary even order the FO virtual array concept
initially introduced in Chevalier and Ferreol (1999) and Dogan and Mendel
(1995a). Some properties of HO VA, presented in Section 8.4.2, then allow
us to show geometrically the increasing resolution of the $2q$-MUSIC method as $q$
increases. Finally, the identifiability issue of the $2q$-MUSIC method is described in
Section 8.4.3.
8.4.1 Higher-Order Virtual Array

Assuming no noise, statistically independent non-Gaussian sources, and no
modeling errors, particularly no mutual coupling between sensors, the matrices
$C_{2q,x}(l)$ and $R_x \triangleq C_{2,x}(1)$, defined by Equation (8.10), have the same algebraic
structure, where the auto-cumulant $c_{2q,m_i}$ and the vector $a_{q,l}(\boldsymbol{\theta}_i)$ have, for $C_{2q,x}(l)$,
the place occupied by the power $c_{2,m_i}$ and the steering vector $a(\boldsymbol{\theta}_i)$, respectively,
for $R_x$. Thus, for the $2q$th-order array-processing methods exploiting Equation
(8.10), the $(N^q \times 1)$ vector $a_{q,l}(\boldsymbol{\theta}_i)$ can be considered as the virtual steering
vector of the source $i$ for the true array of $N$ identical sensors with coordinates
$(x_n, y_n, z_n)$, $1 \le n \le N$. The $N^q$ components of the vector $a_{q,l}(\boldsymbol{\theta}_i)$ correspond to the
quantities

\[
a_{k_1}(\boldsymbol{\theta}_i)\, a_{k_2}(\boldsymbol{\theta}_i) \cdots a_{k_l}(\boldsymbol{\theta}_i)\, a_{k_{l+1}}(\boldsymbol{\theta}_i)^*\, a_{k_{l+2}}(\boldsymbol{\theta}_i)^* \cdots a_{k_q}(\boldsymbol{\theta}_i)^* \quad (1 \le k_j \le N,\ 1 \le j \le q)
\]

where $a_k(\boldsymbol{\theta}_i)$ is the component $k$ of vector $a(\boldsymbol{\theta}_i)$. Numbering, in a natural way,
the $N^q$ values of the $q$-uplet $(k_1, k_2, \ldots, k_l, k_{l+1}, \ldots, k_q)$ by associating to the
latter the integer $K$ defined by

\[
K \triangleq \sum_{j=1}^{q} N^{q-j}(k_j - 1) + 1 \tag{8.22}
\]

and using Equation (8.3) in the components of $a_{q,l}(\boldsymbol{\theta}_i)$, we find that the $K$th
component of the vector $a_{q,l}(\boldsymbol{\theta}_i)$, noted $a_{q,l}(\boldsymbol{\theta}_i)_K$, takes the form

\[
a_{q,l}(\boldsymbol{\theta}_i)_K = \exp\Big\{ j2\pi \Big[ \Big( \sum_{j=1}^{l} x_{k_j} - \sum_{u=1}^{q-l} x_{k_{l+u}} \Big) \cos(\theta_i)\cos(\varphi_i)
+ \Big( \sum_{j=1}^{l} y_{k_j} - \sum_{u=1}^{q-l} y_{k_{l+u}} \Big) \sin(\theta_i)\cos(\varphi_i)
+ \Big( \sum_{j=1}^{l} z_{k_j} - \sum_{u=1}^{q-l} z_{k_{l+u}} \Big) \sin(\varphi_i) \Big] / \lambda \Big\} \tag{8.23}
\]

Comparing Equation (8.23) to $a_K(\boldsymbol{\theta}_i)$ defined by Equation (8.3) with $n = K$
and $f_n(\boldsymbol{\theta}_i, \boldsymbol{\beta}_i) = 1$, we deduce that the vector $a_{q,l}(\boldsymbol{\theta}_i)$ can also be considered as the
true steering vector of source $i$ for the VA of $N^q$ identical virtual sensors (VSs),
with coordinates $(x^l_{k_1 k_2 \ldots k_q}, y^l_{k_1 k_2 \ldots k_q}, z^l_{k_1 k_2 \ldots k_q})$ $(1 \le k_j \le N,\ 1 \le j \le q)$ given by

\[
(x^l_{k_1 k_2 \ldots k_q},\ y^l_{k_1 k_2 \ldots k_q},\ z^l_{k_1 k_2 \ldots k_q})
= \Big( \sum_{j=1}^{l} x_{k_j} - \sum_{u=1}^{q-l} x_{k_{l+u}},\ \sum_{j=1}^{l} y_{k_j} - \sum_{u=1}^{q-l} y_{k_{l+u}},\ \sum_{j=1}^{l} z_{k_j} - \sum_{u=1}^{q-l} z_{k_{l+u}} \Big) \tag{8.24}
\]














Equation (8.24) introduces, in a very simple way and for arrays with identical
sensors, the VA concept for the $2q$th-order DOA estimation problem for the
arrangement $C_{2q,x}(l)$.

Thus, we can consider that the $2q$th-order DOA estimation problem of $P$
statistically independent NB non-Gaussian sources from a given array of $N$
identical sensors with coordinates $(x_n, y_n, z_n)$, $1 \le n \le N$, is, for the arrangement
$C_{2q,x}(l)$, similar to an SO DOA estimation problem for which these $P$ statistically
independent NB sources impinge, with the virtual powers $c_{2q,m_i}$ $(1 \le i \le P)$, on a
VA of $N^q$ identical VSs having the coordinates $(x^l_{k_1 k_2 \ldots k_q}, y^l_{k_1 k_2 \ldots k_q}, z^l_{k_1 k_2 \ldots k_q})$,
$1 \le k_j \le N$ for $1 \le j \le q$, defined by Equation (8.24). Thus, HO array processing may
be used to replace sensors and hardware and so decrease the overall cost of a
given system.
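
The virtual-sensor coordinates of Equation (8.24) are easy to enumerate for a small array. The following sketch lists the distinct virtual sensors for a given $(q, l)$; the function name and the two example geometries (integer coordinates in arbitrary units, chosen so that equality tests are exact) are assumptions made for illustration.

```python
from itertools import product

def virtual_sensors(sensors, q, l):
    """Set of distinct virtual-sensor coordinates given by Eq. (8.24).

    sensors : list of (x, y, z) tuples for the true array
    """
    positions = set()
    for ks in product(range(len(sensors)), repeat=q):
        plus, minus = ks[:l], ks[l:]   # non-conjugated and conjugated terms
        coord = tuple(
            sum(sensors[k][axis] for k in plus) - sum(sensors[k][axis] for k in minus)
            for axis in range(3)
        )
        positions.add(coord)
    return positions

uniform = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]       # uniform linear array
nonuniform = [(0, 0, 0), (1, 0, 0), (3, 0, 0)]    # non-redundant linear array
counts = {
    ("uniform", 1): len(virtual_sensors(uniform, 2, 1)),
    ("uniform", 2): len(virtual_sensors(uniform, 2, 2)),
    ("nonuniform", 1): len(virtual_sensors(nonuniform, 2, 1)),
    ("nonuniform", 2): len(virtual_sensors(nonuniform, 2, 2)),
}
```

Many of the $N^q = 9$ virtual sensors coincide: the uniform array yields 5 distinct positions for both arrangements, whereas the non-redundant array yields $N^2 - N + 1 = 7$ for $l = 1$ and $N(N+1)/2 = 6$ for $l = 2$.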


8.4.2 Properties of Higher-Order Virtual Array 

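We naturally deduce from the previous theory that, for given values of $(q, l)$,
the 3-dB beamwidth of the associated $2q$th-order VA controls the resolution
power of $2q$th-order DOA estimation methods for the arrangement $l$ and for
a finite observation duration. For further insight into this resolution power,
let us compute the spatial correlation coefficient, $\alpha_{\boldsymbol{\theta},\boldsymbol{\theta}_p}(q, l)$ $(0 \le |\alpha_{\boldsymbol{\theta},\boldsymbol{\theta}_p}(q, l)| \le 1)$,
of DOAs $\boldsymbol{\theta}$ and $\boldsymbol{\theta}_p$ for the HO VA associated with parameters $(q, l)$,
defined by

\[
\alpha_{\boldsymbol{\theta},\boldsymbol{\theta}_p}(q, l) \triangleq \frac{a_{q,l}(\boldsymbol{\theta})^{\dagger}\, a_{q,l}(\boldsymbol{\theta}_p)}{[a_{q,l}(\boldsymbol{\theta})^{\dagger} a_{q,l}(\boldsymbol{\theta})]^{1/2}\, [a_{q,l}(\boldsymbol{\theta}_p)^{\dagger} a_{q,l}(\boldsymbol{\theta}_p)]^{1/2}} \tag{8.25}
\]

It is straightforward to verify (Chevalier et al., 2005) that the modulus of
$\alpha_{\boldsymbol{\theta},\boldsymbol{\theta}_p}(q, l)$ can be written as

\[
|\alpha_{\boldsymbol{\theta},\boldsymbol{\theta}_p}(q, l)| = |\alpha_{\boldsymbol{\theta},\boldsymbol{\theta}_p}(1, 1)|^{q} \tag{8.26}
\]

where $\alpha_{\boldsymbol{\theta},\boldsymbol{\theta}_p}(1, 1)$ is the spatial correlation coefficient of DOAs $\boldsymbol{\theta}$ and $\boldsymbol{\theta}_p$ for the
true array of $N$ sensors, defined by

\[
\alpha_{\boldsymbol{\theta},\boldsymbol{\theta}_p}(1, 1) \triangleq \frac{a(\boldsymbol{\theta})^{\dagger}\, a(\boldsymbol{\theta}_p)}{[a(\boldsymbol{\theta})^{\dagger} a(\boldsymbol{\theta})]^{1/2}\, [a(\boldsymbol{\theta}_p)^{\dagger} a(\boldsymbol{\theta}_p)]^{1/2}} \tag{8.27}
\]

From Equation (8.26), we deduce that $|\alpha_{\boldsymbol{\theta},\boldsymbol{\theta}_p}(q, l)|$ is independent of $l$ and is a
decreasing function of $q$, showing the increasing resolution of HO VA as $q$
increases. In particular, it was shown in Chevalier et al. (2005) that the 3-dB
beamwidth of the $2q$th-order VA is equal to $0.84\,\theta_{3\mathrm{dB}}$, $0.76\,\theta_{3\mathrm{dB}}$,
and $0.71\,\theta_{3\mathrm{dB}}$ for $q = 2$, 3, and 4, respectively, where $\theta_{3\mathrm{dB}}$ corresponds to the
3-dB beamwidth of the true array.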
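
Equation (8.26) follows from the Kronecker identity $(a \otimes b)^{\dagger}(c \otimes d) = (a^{\dagger}c)(b^{\dagger}d)$ and can be confirmed numerically. The sketch below does so for a hypothetical four-sensor half-wavelength linear array and two arbitrary azimuths; the array, the angles, and the function names are assumptions.

```python
import cmath
import math
from functools import reduce

def ula(n_sensors, theta):
    """Half-wavelength linear array on the x axis, zero elevation (Eq. 8.3, f_n = 1)."""
    return [cmath.exp(1j * math.pi * n * math.cos(theta)) for n in range(n_sensors)]

def extended(a, q, l):
    """Extended steering vector of Eq. (8.11) as a chain of Kronecker products."""
    conj = [z.conjugate() for z in a]
    return reduce(lambda u, v: [x * y for x in u for y in v], [a] * l + [conj] * (q - l))

def correlation(u, v):
    """Spatial correlation coefficient u^H v / (|u| |v|), Eqs. (8.25) and (8.27)."""
    num = sum(x.conjugate() * y for x, y in zip(u, v))
    den = math.sqrt(sum(abs(x) ** 2 for x in u) * sum(abs(y) ** 2 for y in v))
    return num / den

a1, a2 = ula(4, math.radians(80.0)), ula(4, math.radians(95.0))
alpha_11 = abs(correlation(a1, a2))
moduli = {(q, l): abs(correlation(extended(a1, q, l), extended(a2, q, l)))
          for q in (1, 2, 3) for l in range(0, q + 1)}
```

Every entry `moduli[(q, l)]` matches `alpha_11 ** q` to rounding precision, whatever the arrangement index $l$, which is the independence of $l$ stated above.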

Another interesting result deduced from Equation (8.26) is that rank-1 ambiguities
(or grating lobes; Compton, 1988) of the $2q$th-order VA coincide with
those of the true array whatever the values of $q$ and $l$, since the directions $\boldsymbol{\theta} \ne \boldsymbol{\theta}_p$
giving rise to $|\alpha_{\boldsymbol{\theta},\boldsymbol{\theta}_p}(q, l)| = 1$ are exactly the ones that give rise to $|\alpha_{\boldsymbol{\theta},\boldsymbol{\theta}_p}(1, 1)| = 1$.
A consequence of this result is that a necessary and sufficient condition to obtain
the $2q$th-order VA without any rank-1 ambiguities is that the true array of $N$
sensors have no rank-1 ambiguities.


8.4.3 Identifiability of the 2q-MUSIC Method 

Following the developments of Section 8.3, we deduce that the $2q$-MUSIC
algorithm for the arrangement indexed by $l$ is able to estimate the DOAs of $P$
noncoherent sources from an array of $N$ identical sensors, provided that hypotheses
H1 through H4 are verified and that the DOAs of the sources are the only
solutions to Equation (8.18).


Analysis 


As for statistically independent sources, hypotheses H1 through H4 reduce to H1''
and H2'', the identifiability analysis of the $2q$-MUSIC method mainly reduces to
analysis of the conditions under which H2'' is verified. Thus, because some of the
$N^q$ VSs of the associated $2q$th-order VA may coincide, we note $N^l_{2q}$ ($N^l_{2q} \le N^q$) as
the number of different VSs of the VA associated with the parameters $(q, l)$. This
number, $N^l_{2q}$, is directly related to the geometry of the true array of $N$ sensors
as shown by Equation (8.24). In these conditions, $N^q - N^l_{2q}$ components of all
vectors $a_{q,l}(\boldsymbol{\theta})$ are redundant, providing no information. Therefore, $N^q - N^l_{2q}$
rows of the $A_{q,l}$ matrix provide no information and are linear combinations
of the other rows, which means that the rank of $A_{q,l}$ cannot be greater than
$N^l_{2q}$. We then deduce that the $A_{q,l}$ matrix may have a rank equal to $P$ only
if $P \le N^l_{2q}$.

Conversely, for a $2q$th-order VA without ambiguities up to order $N^l_{2q}$, $P$
sources coming from $P$ different directions generate an $A_{q,l}$ matrix with a full-rank
$P$ as long as $P \le N^l_{2q}$. Thus, the maximal number of statistically independent
sources able to generate a matrix $A_{q,l}$ with rank $P$ is $N^l_{2q}$. However, when $P = N^l_{2q}$,
an arbitrary vector $a_{q,l}(\boldsymbol{\theta})$ associated with an arbitrary DOA $\boldsymbol{\theta}$ is necessarily
a linear combination of the source steering vectors $a_{q,l}(\boldsymbol{\theta}_i)$, $1 \le i \le N^l_{2q}$, since
matrix $A_{q,l}$ cannot have a rank greater than $N^l_{2q}$. All DOAs $\boldsymbol{\theta}$ are thus solutions
to Equation (8.18), which does not allow the sources' DOA estimation.
A necessary condition for the DOAs of the sources to be the only solutions
to Equation (8.18), then, is that $P < N^l_{2q}$, and this condition becomes sufficient
for HO VA with no ambiguities. From the previous results we deduce that the
$2q$-MUSIC algorithm for the arrangement $l$ is able to process up to $N^l_{2q} - 1$
sources.

This result means that the processing capacity of 2q-MUSIC depends on the
geometry of the true array of N sensors and on the parameters q and l, which
shows in particular the existence of an optimal arrangement of the 2qth-order data
statistics for a given value of q. It is shown in Chevalier et al. (2005) that for a
given value of q, the optimal arrangement is associated with the integer l that
minimizes the quantity $|2l - q|$, finally generating steering vectors
$a_{q,l}(\theta_p)$ for which the number of conjugate vectors is the least different
from the number of nonconjugate vectors. In particular, for q = 2 this corresponds
to l = 1 (i.e., to steering vectors of the form $[a(\theta_i) \otimes a(\theta_i)^*]$).

TABLE 8.1 $N_{\max}[2q, l]$ as a Function of N for Several Values of q
and l and for Arrays with Identical Sensors

m = 2q       l    N_max[2q, l]
4 (q = 2)    2    N(N+1)/2
             1    N^2 - N + 1
6 (q = 3)    3    N!/[6(N-3)!] + N(N-1) + N
             2    N!/[2(N-3)!] + N(N-1) + N
8 (q = 4)    4    N!/[24(N-4)!] + N!/[2(N-3)!] + 1.5N(N-1) + N
             3    N!/[6(N-4)!] + N!/(N-3)! + 1.5N(N-1) + N
             2    N!/[4(N-4)!] + N!/(N-3)! + 2N(N-1) + 1

Source: From Chevalier et al. (2005). © IEEE, with permission.


Finally, analysis of the symmetries of the HO VA shows (Chevalier et al.,
2005) that, for given values of N, q, and l, $N_{2q}^{l}$ is necessarily upper-bounded
by a quantity, noted $N_{\max}[2q, l]$, such that $N_{\max}[2q, l] \le N^q$. Table 8.1
illustrates, for arrays with identical sensors, the expression of $N_{\max}[2q, l]$ as
a function of N for $2 \le q \le 4$ and several values of l. In fact, this upper bound
corresponds to $N_{2q}^{l}$ in most cases of array geometries with no particular
symmetries (e.g., a uniform circular array with a number of sensors corresponding to
a prime number; Chevalier et al., 2005), but it cannot be reached by $N_{2q}^{l}$ for
arrays with particular symmetries (e.g., a uniform linear array (ULA), with an HO VA
that is analyzed in the next section).
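The entries of Table 8.1 are easy to evaluate numerically. The following is a minimal sketch, not part of the original analysis: the function name `n_max` is ours, and the falling factorials N!/(N-k)! are computed with `math.perm`, which returns 0 when N < k.

```python
from math import perm  # perm(n, k) = n!/(n-k)!, and 0 when k > n


def n_max(n_sensors: int, q: int, l: int) -> int:
    """Upper bound N_max[2q, l] of Table 8.1 (arrays with identical sensors).

    Only the (q, l) pairs listed in the table are supported.
    """
    n = n_sensors
    pairs = n * (n - 1)  # N(N-1), always even, so 1.5*N(N-1) is an integer
    table = {
        (2, 2): n * (n + 1) // 2,
        (2, 1): n * n - n + 1,
        (3, 3): perm(n, 3) // 6 + pairs + n,
        (3, 2): perm(n, 3) // 2 + pairs + n,
        (4, 4): perm(n, 4) // 24 + perm(n, 3) // 2 + 3 * pairs // 2 + n,
        (4, 3): perm(n, 4) // 6 + perm(n, 3) + 3 * pairs // 2 + n,
        (4, 2): perm(n, 4) // 4 + perm(n, 3) + 2 * pairs + 1,
    }
    return table[(q, l)]


# Five sensors, fourth-order statistics (q = 2): bound for each arrangement.
for l in (1, 2):
    bound = n_max(5, 2, l)
    print(f"q=2, l={l}: N_max={bound}, at most {bound - 1} sources")
```

For N = 5 and q = 2 this gives 21 for l = 1 and 15 for l = 2, so the arrangement l = 1 offers the larger bound, in line with the rule that the optimal l minimizes |2l - q|.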


Example of the Uniform Linear Array 

For a ULA, it is always possible to choose a coordinate system in which the sensor
n has the coordinates $(x_n = nd, 0, 0)$, $1 \le n \le N$, where d is the inter-element
spacing. As a consequence, the VSs of the HO VA for parameters (q, l) are, from
Equation (8.24), at coordinates

$$\left(x^{l}_{k_1 k_2 \ldots k_q},\; y^{l}_{k_1 k_2 \ldots k_q},\; z^{l}_{k_1 k_2 \ldots k_q}\right) = \left(\Big(\sum_{j=1}^{l} k_j - \sum_{u=1}^{q-l} k_{l+u}\Big)\, d,\; 0,\; 0\right) \tag{8.28}$$

for $1 \le k_j \le N$ and $1 \le j \le q$. This shows that the associated HO VA is also a ULA
with the same inter-element spacing, whatever the arrangement l. Moreover, for
given values of q, l, and N, the minimum and maximum values of
$x^{l}_{k_1 k_2 \ldots k_q}$, noted $x^{l}_{\min}$ and $x^{l}_{\max}$, respectively, are
given by

$$x^{l}_{\min} = [\,l - (q - l)N\,]\,d = [\,l(1 + N) - qN\,]\,d \tag{8.29}$$

$$x^{l}_{\max} = [\,lN - (q - l)\,]\,d = [\,l(1 + N) - q\,]\,d \tag{8.30}$$

and the number of different VSs, $N_{2q}^{l}$, of the associated VA, easily deduced from
Equations (8.29) and (8.30), is given by

$$N_{2q}^{l} = (x^{l}_{\max} - x^{l}_{\min})/d + 1 = qN - (q - 1) = q(N - 1) + 1 \tag{8.31}$$

This is independent of l, and means that, for given values of q and N, the number
of VSs is independent of the chosen arrangement l.

Put another way, in terms of processing power, for a given value of q and
because of the symmetries of the array, all arrangements $C_{2q,x}(l)$ are equivalent
for a ULA. The 2q-MUSIC algorithm is thus able to process up to
$N_{2q}^{l} - 1 = q(N - 1)$ statistically independent non-Gaussian sources from a ULA
of N > 1 sensors, which is an increasing function of N and q and is strictly lower
than $N_{\max}[2q, l] - 1$ as soon as q > 1 and N > 2.
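The count in Equation (8.31) can be checked by brute force. The sketch below (our own helper, `ula_virtual_sensors`, using only the coordinate rule of Equation (8.28)) enumerates every index tuple, with abscissas expressed in units of d, and records how many tuples fall on each VS.

```python
from collections import Counter
from itertools import product


def ula_virtual_sensors(n_sensors: int, q: int, l: int) -> Counter:
    """Multiplicity of each VS abscissa (in units of d) for a ULA, per (8.28)."""
    counts = Counter()
    for ks in product(range(1, n_sensors + 1), repeat=q):
        # The first l indices are added, the remaining q - l are subtracted.
        x = sum(ks[:l]) - sum(ks[l:])
        counts[x] += 1
    return counts


vs = ula_virtual_sensors(5, 2, 1)
print(len(vs))                        # number of different VSs
print([vs[x] for x in sorted(vs)])    # order of multiplicity of each VS
```

For five sensors and q = 2 the enumeration yields nine different VSs, with multiplicities 1, 2, 3, 4, 5, 4, 3, 2, 1 for l = 1; changing l shifts the VA along the axis but leaves the number of VSs unchanged.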

To illustrate the previous results, Figure 8.2 shows the HO VA, with nine
VSs, of a ULA of five sensors for which d = λ/2, together with the order of
multiplicity of the VSs (i.e., the number of VSs having the same coordinates),
with the x- and y-axes normalized by the wavelength λ.

FIGURE 8.2 4th-order VA of a ULA of five sensors with the order of multiplicities of the VSs.
(Source: From Chevalier et al. (2005). © IEEE, with permission.)

FIGURE 8.3 VA pattern for q = 1, 2, 3, 4; ULA with five sensors, d = λ/2; pointing direction:
0°. Horizontal axis: θ (degrees). (Source: From Chevalier et al. (2005). © IEEE, with permission.)

To complete these results and to illustrate Equation (8.26) in relation to the
increasing resolution of HO VA as q increases, Figure 8.3 shows the array pattern
(the normalized inner product of two associated steering vectors) of HO VA
associated with a ULA of five sensors equi-spaced half a wavelength apart for q =
1, 2, 3, and 4, and for a pointing direction equal to 0°. Note the decreasing 3-dB
beamwidth and sidelobe level of the array pattern as q increases in proportions
given by Equation (8.26).
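The narrowing of the pattern with q can be reproduced directly from the Kronecker structure of the extended steering vectors, $a_{q,l}(\theta) = a(\theta)^{\otimes l} \otimes a(\theta)^{*\otimes(q-l)}$. The sketch below is illustrative only: the helper names are ours, and the ULA phase convention (DOA measured from the array axis, so that 90° is broadside) is an assumption. It compares the modulus of the normalized inner product of two extended steering vectors with the q-th power of the same quantity for the true array.

```python
import cmath
import math
from functools import reduce


def steer_ula(theta_deg, n_sensors=5, d_over_lambda=0.5):
    """ULA steering vector; sensor n at x_n = n*d, DOA measured from the array axis."""
    phase = 2 * math.pi * d_over_lambda * math.cos(math.radians(theta_deg))
    return [cmath.exp(1j * phase * n) for n in range(1, n_sensors + 1)]


def kron(u, v):
    """Kronecker product of two vectors stored as lists."""
    return [x * y for x in u for y in v]


def extended(a, q, l):
    """Extended steering vector: l copies of a, then q - l copies of conj(a)."""
    factors = [a] * l + [[x.conjugate() for x in a]] * (q - l)
    return reduce(kron, factors)


def pattern(u, v):
    """Modulus of the normalized inner product of two steering vectors."""
    inner = sum(x.conjugate() * y for x, y in zip(u, v))
    norm = math.sqrt(sum(abs(x) ** 2 for x in u) * sum(abs(y) ** 2 for y in v))
    return abs(inner) / norm


a0, a1 = steer_ula(90.0), steer_ula(100.0)   # pointing direction and test direction
base = pattern(a0, a1)
for q in (1, 2, 3):
    for l in range(1, q + 1):
        va = pattern(extended(a0, q, l), extended(a1, q, l))
        print(q, l, round(va, 6), round(base ** q, 6))
```

The two printed columns coincide for every l: the VA pattern modulus is the true-array pattern modulus raised to the power q, which is why both the beamwidth and the sidelobe level shrink as q grows while the arrangement l has no effect on the modulus.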


Example of the Uniform Circular Array 

For a UCA of N sensors, it is always possible to choose a coordinate system in
which the sensor n has the coordinates $(R\cos\phi_n, R\sin\phi_n, 0)$, $1 \le n \le N$,
where R is the radius of the array and where $\phi_n = (n - 1)2\pi/N$. For q > 2, the
analytical computation of the VA (introduced in Chevalier et al., 2005) is a tedious
task. It can be verified that, for given values of q and l, the number of different
VSs, $N_{2q}^{l}$, of the associated VA corresponds to the upper bound,
$N_{\max}[2q, l]$, when N is a prime number. In this case, the 2q-MUSIC method is able
to process up to $N_{2q}^{l} - 1 = N_{\max}[2q, l] - 1$ statistically independent
non-Gaussian sources from a UCA of N sensors, where some values of $N_{\max}[2q, l]$
are presented in Table 8.1. Otherwise, $N_{2q}^{l}$ remains smaller than
$N_{\max}[2q, l]$.
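For small arrays the number of different VSs of a UCA can simply be counted. The sketch below is a hedged illustration: the function name and the rounding tolerance used to decide that two VSs coincide are our choices. Each VS is formed as a sum of l sensor positions minus a sum of q - l sensor positions, as in the ULA case.

```python
import math
from itertools import product


def uca_virtual_sensors(n_sensors, q, l, radius=1.0, decimals=9):
    """Number of different VSs of the 2q-th order VA of a UCA (brute force)."""
    pos = [(radius * math.cos(2 * math.pi * n / n_sensors),
            radius * math.sin(2 * math.pi * n / n_sensors))
           for n in range(n_sensors)]
    distinct = set()
    for ks in product(range(n_sensors), repeat=q):
        x = sum(pos[k][0] for k in ks[:l]) - sum(pos[k][0] for k in ks[l:])
        y = sum(pos[k][1] for k in ks[:l]) - sum(pos[k][1] for k in ks[l:])
        # Coordinates are rounded so that numerically equal VSs are merged.
        distinct.add((round(x, decimals), round(y, decimals)))
    return len(distinct)


print(uca_virtual_sensors(5, 2, 1))   # prime N, arrangement l = 1
print(uca_virtual_sensors(5, 2, 2))   # prime N, arrangement l = 2
print(uca_virtual_sensors(4, 2, 1))   # non-prime N, arrangement l = 1
```

With five sensors the counts are 21 (l = 1) and 15 (l = 2), which are the Table 8.1 bounds N^2 - N + 1 and N(N+1)/2. With four sensors and l = 1, parallel sides of the square make several VSs coincide and the count drops to 9, below the bound of 13.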

To illustrate the previous result, Figures 8.4 and 8.5 show the VA of a UCA
of five sensors for which R = 0.8λ, together with the order of multiplicity of the
VSs, for (q, l) = (2, 2) and (q, l) = (2, 1), respectively. Note the greater value of
$N_{2q}^{l}$, equal to 21, for the optimal arrangement l = 1, whereas $N_{2q}^{l} = 15$
for l = 2.

To complete these results and to illustrate Equation (8.26) for a UCA,
Figure 8.6 shows the array pattern of HO VA associated with a UCA of five
sensors such that R = 0.8λ, for q = 1, 2, 3, and 4, and for a pointing direction
equal to 0°. Note the decreasing 3-dB beamwidth and sidelobe level of the array
pattern as q increases in proportions given by Equation (8.26).

FIGURE 8.6 VA pattern for q = 1, 2, 3, 4; UCA with five sensors, R = 0.8λ; pointing direction: 0°.


8.5 2q-MUSIC PERFORMANCE 

This section describes the performance of the 2q-MUSIC method in the presence
of modeling errors. To this aim, in Section 8.5.1 we present the main sources of
performance degradation in operational contexts, that is, the finite-sample effect
and modeling errors. In practical situations, modeling errors are generally the
dominant source of performance degradation. For this reason, an observation
model including the modeling errors is introduced in Section 8.5.2. The performance
of 2-MUSIC and of 2q-MUSIC for q > 1 in the presence of modeling errors is then
presented in Sections 8.5.3 and 8.5.4, respectively. Finally, results of previous
sections are illustrated in Section 8.5.5.


8.5.1 Finite-Sample Effect and Modeling Errors 

In operational contexts, for given choices of sensor array and algorithm, the
performance of the latter is mainly controlled by both finite-sample effect, due
to the finite observation duration, and modeling errors such as array calibration
errors or phase and amplitude residual mismatches between reception chains.
The sensitivity of HO DOA estimation methods to finite-sample effect has been
studied analytically in Cardoso and Moulines (1995), Fan and Younan (1995), and
Yuen and Friedlander (1996, 1998). Cardoso and Moulines (1995) mainly considered
FO-contracted MUSIC methods, whereas Fan and Younan (1995) dealt with the harmonic
retrieval problem for N = 1. Moreover, Yuen and Friedlander (1996, 1998) analyzed
the sensitivity of both 4-ESPRIT and FO virtual ESPRIT for far-field and near-field
sources, respectively. Most of these works (Cardoso and Moulines, 1995, in
particular) show the higher sensitivity to finite-sample effects of FO versus SO
methods, especially for weak sources. Nevertheless, to our knowledge, the
sensitivity of 2q-MUSIC to finite-sample effect has not yet been studied
analytically for q > 1, which prevents us from describing such results in the next
sections.

In contrast, the sensitivity of HO DOA estimation methods with respect to
modeling errors, omnipresent in all operational contexts, was analyzed in Liu
and Mendel (1999b) for the FO virtual ESPRIT algorithm and in Chevalier et al.
(2006) for the 2q-MUSIC algorithm. We present here some modified results of the
analysis described in Chevalier et al. (2006), showing in particular the influence
of parameters q and l on robustness with respect to modeling errors of 2q-MUSIC.


8.5.2 Model and Problem Formulation 


In the presence of modeling errors, the observation vector defined by Equation
(8.2) becomes

$$x(t) = \sum_{i=1}^{P} m_i(t)\,\tilde{a}(\theta_i) + v(t) = \tilde{A}\,m(t) + v(t) \tag{8.32}$$

where $\tilde{A}$ is the (N × P) matrix of the corrupted source steering vectors
$\tilde{a}(\theta_i)$, $(1 \le i \le P)$, such that

$$\tilde{a}(\theta_i) = a(\theta_i) + e(\theta_i) \tag{8.33}$$

where $e(\theta_i)$ is the modeling error vector of the source i, whereas vector
$a(\theta_i)$ is such that $a(\theta_i)^{\dagger} a(\theta_i) = N$.

From Equation (8.33) we deduce that

$$\tilde{A} = A + E = A(E) \tag{8.34}$$

where E is the (N × P) matrix of the error vectors $e(\theta_i)$, $(1 \le i \le P)$.
Note that the model (8.33) is well suited for any distortion (mutual coupling
between sensors, mismatches between reception chains, sensor position errors)
(Ferreol et al., 2006). To simplify the notations, we note in the following that
$a_i = a(\theta_i)$, $\tilde{a}_i = \tilde{a}(\theta_i)$, and $e_i = e(\theta_i)$. We
also assume that a very large number of sampled observation vectors, $x(kT_e)$, have
been collected, resulting in a perfect measurement of the matrix $C_{2q,x}(l)$ given by
Equation (8.8) but where $C_{2q,x_g}(l)$ is now given by

$$C_{2q,x_g}(l) = \left[\tilde{A}_g^{\otimes l} \otimes \tilde{A}_g^{*\otimes(q-l)}\right] C_{2q,m_g}(l) \left[\tilde{A}_g^{\otimes l} \otimes \tilde{A}_g^{*\otimes(q-l)}\right]^{\dagger} \tag{8.35}$$

where $\tilde{A}_g$ is the $(N \times P_g)$ submatrix of $\tilde{A}$ corresponding to
the gth group of sources.

Assuming that the number of sources P and their correlation properties are
known, the problem considered in this section is to find the P DOA
$\hat{\theta}_i$ $(1 \le i \le P)$, minimizing the left side of Equation (8.18), or the
following criterion:

$$P_{2q,l}(\theta, E) \triangleq a_{q,l}(\theta)^{\dagger}\, \Pi_{2q,n}(l)(E)\, a_{q,l}(\theta) \tag{8.36}$$

In Equation (8.36), $\Pi_{2q,n}(l)(E) = U_{2q,n}(l)(E)\,U_{2q,n}(l)(E)^{\dagger}$, where
$U_{2q,n}(l)(E)$ is the $(N^q \times (N^q - r_{2q,x}(l)))$ unitary matrix of the
eigenvectors of $C_{2q,x}(l)$ associated with the $(N^q - r_{2q,x}(l))$ zero
eigenvalues of $C_{2q,x}(l)$. In the absence of modeling errors (E = 0), the P minima
of Equation (8.36) are P zeros corresponding to the DOA of the sources,
$\theta_i$ $(1 \le i \le P)$. However, in the presence of modeling errors
($E \neq 0$), the criterion of Equation (8.36) is no longer zero for the DOA of the
sources but presents P local minima for directions $\hat{\theta}_i$
$(1 \le i \le P)$, different from $\theta_i$ $(1 \le i \le P)$. The variable
$\Delta\theta_i = \hat{\theta}_i - \theta_i$ defines the estimation error in the DOA
of source i. To simplify the mathematical developments, we limit our analysis to a
mono-dimensional DOA estimation problem for which $\theta_i$, $\hat{\theta}_i$, and
$\Delta\theta_i$ $(1 \le i \le P)$ are scalar quantities.

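Using a first-order Taylor expansion of the first partial derivative,
$\dot{P}_{2q,l}(\theta, E)$, of $P_{2q,l}(\theta, E)$ with respect to $\theta$ around
$\theta = \theta_i$, and exploiting the fact that
$\dot{P}_{2q,l}(\hat{\theta}_i, E) = 0$, we obtain an approximated expression of
$\Delta\theta_i$ given by

$$\Delta\theta_i \approx -\,\frac{\dot{P}_{2q,l}(\theta_i, E)}{\ddot{P}_{2q,l}(\theta_i, E)} \tag{8.37}$$

where $\ddot{P}_{2q,l}(\theta, E)$ corresponds to the second derivative of
$P_{2q,l}(\theta, E)$ with respect to $\theta$. Using Equation (8.36),
$\dot{P}_{2q,l}(\theta_i, E)$ and $\ddot{P}_{2q,l}(\theta_i, E)$ can be written as

$$\dot{P}_{2q,l}(\theta_i, E) = 2\,\mathrm{Re}\!\left[\dot{a}_{q,l}(\theta_i)^{\dagger}\, \Pi_{2q,n}(l)(E)\, a_{q,l}(\theta_i)\right] \tag{8.38}$$

$$\ddot{P}_{2q,l}(\theta_i, E) = 2\,\mathrm{Re}\!\left[\ddot{a}_{q,l}(\theta_i)^{\dagger}\, \Pi_{2q,n}(l)(E)\, a_{q,l}(\theta_i)\right] + 2\,\dot{a}_{q,l}(\theta_i)^{\dagger}\, \Pi_{2q,n}(l)(E)\, \dot{a}_{q,l}(\theta_i) \tag{8.39}$$

where Re[.] indicates the real part and $\dot{a}_{q,l}(\theta_i)$ and
$\ddot{a}_{q,l}(\theta_i)$ indicate the first and second derivative of
$a_{q,l}(\theta)$ with respect to $\theta$ at $\theta = \theta_i$, respectively.
Considering that E is a random matrix, the quantities $\Delta\theta_i$
$(1 \le i \le P)$ become random variables.

The purpose of this section is to compute the root mean-square error
(RMSE) of $\hat{\theta}_i$, defined by
$\mathrm{RMSE}_i = (\mathrm{E}[\Delta\theta_i^2])^{1/2}$, as a function of q, l, and
the statistics of E.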
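To make Equations (8.37) through (8.39) concrete, the following sketch evaluates them for the simplest possible case and compares the result with a direct numerical minimization of the criterion. Everything specific here is an assumption chosen for illustration: q = l = 1, a single source, a five-sensor ULA with half-wavelength spacing and DOA measured from the array axis, a fixed (non-random) error vector `e`, and a noise projector built directly from the corrupted steering vector.

```python
import cmath
import math

N = 5                      # assumed ULA size, half-wavelength spacing
IDX = range(1, N + 1)      # sensor n sits at x_n = n*d


def a(theta):
    return [cmath.exp(1j * math.pi * n * math.cos(theta)) for n in IDX]


def a_dot(theta):          # first derivative of a(theta)
    return [-1j * math.pi * n * math.sin(theta) * x for n, x in zip(IDX, a(theta))]


def a_ddot(theta):         # second derivative of a(theta)
    s, c = math.sin(theta), math.cos(theta)
    return [(-1j * math.pi * n * c - (math.pi * n * s) ** 2) * x
            for n, x in zip(IDX, a(theta))]


def inner(u, v):
    """u^dagger v."""
    return sum(x.conjugate() * y for x, y in zip(u, v))


def noise_form(u, v, a_tilde):
    """u^dagger Pi v, with Pi = I - a~ a~^dagger / (a~^dagger a~) (one source)."""
    return inner(u, v) - inner(u, a_tilde) * inner(a_tilde, v) / inner(a_tilde, a_tilde)


def criterion(theta, a_tilde):                      # (8.36) for q = l = 1
    at = a(theta)
    return noise_form(at, at, a_tilde).real


def delta_theta_first_order(theta_i, a_tilde):
    av, ad, add = a(theta_i), a_dot(theta_i), a_ddot(theta_i)
    p_dot = 2 * noise_form(ad, av, a_tilde).real                    # (8.38)
    p_ddot = (2 * noise_form(add, av, a_tilde).real
              + 2 * noise_form(ad, ad, a_tilde).real)               # (8.39)
    return -p_dot / p_ddot                                          # (8.37)


def delta_theta_search(theta_i, a_tilde, half_width=0.05, iters=200):
    """Ternary search for the criterion minimum around theta_i (radians)."""
    lo, hi = theta_i - half_width, theta_i + half_width
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if criterion(m1, a_tilde) < criterion(m2, a_tilde):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi) - theta_i


theta_1 = math.radians(90.0)
e = [0.02 + 0.01j, -0.01 - 0.02j, 0.015 + 0.005j, -0.02 + 0.01j, 0.01 + 0.03j]
a_tilde = [x + y for x, y in zip(a(theta_1), e)]    # corrupted steering vector (8.33)
print(math.degrees(delta_theta_first_order(theta_1, a_tilde)))
print(math.degrees(delta_theta_search(theta_1, a_tilde)))
```

For small errors the two printed DOA errors should agree closely, which is what justifies replacing the search for the perturbed minimum by the closed-form ratio of Equation (8.37).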


8.5.3 Solution for q = 1 

The performance of the 2-MUSIC method is computed in this section through a
first- and a second-order expansion of $\Pi_{2,n}(1)(E) = \Pi_{2,n}(E)$, respectively.


First-Order Expansion of $\Pi_{2,n}(E)$

The computation of the RMSE of the previous DOA estimates, $\hat{\theta}_i$
$(1 \le i \le P)$, was considered in Friedlander (1990), Li and Vaccaro (1992), and
Swindlehurst and Kailath (1992) for q = 1, that is, for the 2-MUSIC algorithm and
from an expansion of $\Pi_{2,n}(E)$ at the first order in E around E = 0. More
precisely, under these assumptions it was shown in Friedlander (1990) that
$\Delta\theta_i$ and $\mathrm{E}[\Delta\theta_i^2]$ are given by

$$\Delta\theta_i \approx \frac{\mathrm{Re}\!\left[e_i^{\dagger} b_i\right]}{\dot{a}_i^{\dagger} b_i} \tag{8.40}$$

and

$$\mathrm{E}[\Delta\theta_i^2] \approx \frac{b_i^{\dagger} R_{e_i} b_i + \mathrm{Re}\!\left[b_i^{T} C_{e_i}^{*} b_i\right]}{2\,(\dot{a}_i^{\dagger} b_i)^2} \tag{8.41}$$

where $\dot{a}_i = \dot{a}(\theta_i)$; $b_i = \Pi_{2,n}(E = 0)\,\dot{a}_i$;
$R_{e_i} = \mathrm{E}[e_i e_i^{\dagger}]$; and $C_{e_i} = \mathrm{E}[e_i e_i^{T}]$.

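Second-Order Expansion of $\Pi_{2,n}(E)$

The first-order expansion of $\Pi_{2,n}(E)$ around E = 0 generates results in
agreement with the simulations as long as the sources are not in the resolution limit.
However, for sources in the resolution limit, results generated by the first-order
expansion do not agree with the simulations, and an SO expansion of $\Pi_{2,n}(E)$
around E = 0 has to be considered. This was done recently in Ferreol et al. (2006),
where the associated results were in complete agreement with simulations, even
for high spatial correlation between sources. More precisely, using an SO expansion
of $\Pi_{2,n}(E)$ around E = 0, it was shown in Ferreol et al. (2006) that
$\Delta\theta_i$ and $\mathrm{E}[\Delta\theta_i^2]$ are given by

$$\Delta\theta_i \approx -\,\frac{\varepsilon^{\dagger} \dot{Q}_i\, \varepsilon}{\varepsilon^{\dagger} \ddot{Q}_i\, \varepsilon} \tag{8.42}$$

$$\mathrm{E}[\Delta\theta_i^2] \approx \frac{\mathrm{Tr}\!\left[\dot{Q}_i^{\otimes 2} R_{\varepsilon}^{(4)}\right]}{\mathrm{Tr}\!\left[\ddot{Q}_i^{\otimes 2} R_{\varepsilon}^{(4)}\right]} \tag{8.43}$$

where Tr[.] denotes trace; $\varepsilon$ is the ((2NP + 1) × 1) vector defined by
$\varepsilon = [1, \mathrm{vec}(E)^{T}, \mathrm{vec}(E)^{\dagger}]^{T}$;
$\mathrm{vec}(E) = [e_1^{T}, \ldots, e_P^{T}]^{T}$; $R_{\varepsilon}^{(4)}$ is the
$((2NP + 1)^2 \times (2NP + 1)^2)$ matrix defined by
$R_{\varepsilon}^{(4)} = \mathrm{E}[\varepsilon^{\otimes 2} \varepsilon^{\otimes 2\,\dagger}]$;
and $\dot{Q}_i$ and $\ddot{Q}_i$ are the ((2NP + 1) × (2NP + 1)) matrices defined by

$$\dot{Q}_i = Q(a_i, \dot{a}_i) + Q(\dot{a}_i, a_i) \tag{8.44}$$

$$\ddot{Q}_i = Q(a_i, \ddot{a}_i) + Q(\ddot{a}_i, a_i) + 2\,Q(\dot{a}_i, \dot{a}_i) \tag{8.45}$$

where $\ddot{a}_i = \ddot{a}(\theta_i)$. In Equations (8.44) and (8.45), for the
(N × 1) vectors u and v, Q(u, v) is the ((2NP + 1) × (2NP + 1)) matrix defined by

$$Q(u, v) = \begin{pmatrix} \alpha & q_{12}^{\dagger} & 0^{T} \\ q_{21} & Q_{22} & Q_{23} \\ 0 & Q_{32} & Q_{33} \end{pmatrix} \tag{8.46}$$

where $\alpha = v^{\dagger}\, \Pi_{2,n}(E = 0)\, u$; $q_{12}$ and $q_{21}$ are the
(NP × 1) vectors defined by $q_{12} = \phi(u, v)$ and $q_{21} = \phi(v, u)$,
respectively, with
$\phi(u, v) = [(A^{\dagger}A)^{-1}A^{\dagger}u]^{*} \otimes [\Pi_{2,n}(E = 0)\,v]$;
0 is the (NP × 1) zero vector; and $Q_{22}$, $Q_{23}$, $Q_{32}$, and $Q_{33}$ are
(NP × NP) matrices defined by

$$Q_{22} = \psi\!\left[(A^{\dagger}A)^{-1}A^{\dagger},\; (A^{\dagger}A)^{-1}A^{\dagger},\; \Pi_{2,n}(E = 0)\right] \tag{8.47}$$

$$Q_{23} = \psi\!\left[(A^{\dagger}A)^{-1}A^{\dagger},\; \Pi_{2,n}(E = 0),\; A(A^{\dagger}A)^{-1}\right] \Pi_p \tag{8.48}$$

$$Q_{32} = \Pi_p^{\dagger}\, \psi\!\left[\Pi_{2,n}(E = 0),\; (A^{\dagger}A)^{-1}A^{\dagger},\; (A^{\dagger}A)^{-1}A^{\dagger}\right] \tag{8.49}$$

$$Q_{33} = -\,\Pi_p^{\dagger}\, \psi\!\left[\Pi_{2,n}(E = 0),\; \Pi_{2,n}(E = 0),\; (A^{\dagger}A)^{-1}\right] \Pi_p \tag{8.50}$$

respectively, where $\psi[X, Y, Z] = [(Xv)^{*}(Yu)^{\dagger}] \otimes Z$ and $\Pi_p$
is the (NP × NP) permutation matrix such that
$\mathrm{vec}(E^{T}) = \Pi_p\, \mathrm{vec}(E)$.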


8.5.4 Solution for q > 1 


The RMSEs of the source DOA estimates were computed in the previous section
from Equations (8.37) to (8.39), with q = l = 1 and either a first-order or an SO
Taylor expansion of
$\Pi_{2,n}(E) = I_N - A(E)\,(A(E)^{\dagger}A(E))^{-1}A(E)^{\dagger}$ around E = 0,
where $A(E) = [\tilde{a}_1, \ldots, \tilde{a}_P] = A + E$. Assuming statistically
independent sources and considering arbitrary values of q and l, the projector on
the noise subspace, $\Pi_{2q,n}(l)(E)$, takes the form

$$\Pi_{2q,n}(l)(E) \triangleq I_{N^q} - A_{q,l}(E)\,\big(A_{q,l}(E)^{\dagger}A_{q,l}(E)\big)^{-1}A_{q,l}(E)^{\dagger}$$

where $A_{q,l}(E) = [\tilde{a}_{q,l}(\theta_1), \ldots, \tilde{a}_{q,l}(\theta_P)]$ and
$\tilde{a}_{q,l}(\theta_i) = \tilde{a}_i^{\otimes l} \otimes \tilde{a}_i^{*\otimes(q-l)}$.

The use of an SO Taylor expansion of $\Pi_{2q,n}(l)(E)$ around E = 0, for the
computation of the RMSE of the sources, requires an SO Taylor expansion of
$A_{q,l}(E)$ around E = 0, which dramatically complicates the computations and
prevents us from exploiting the results of Section 8.5.3. For this reason, we limit
the Taylor expansion of $A_{q,l}(E)$ around E = 0 to the first order, which finally also
imposes the limitation to the first order of the Taylor expansion of
$\Pi_{2q,n}(l)(E)$ around E = 0. In Chevalier et al. (2006), an SO Taylor expansion of
$\Pi_{2q,n}(l)(E)$ around E = 0 was used but only from a first-order expansion of
$A_{q,l}(E)$ around E = 0, which is not completely rigorous but still gives good
results in agreement with the simulations.

Considering the first-order expansion of $A_{q,l}(E)$ around E = 0, we obtain

$$A_{q,l}(E) \approx A_{q,l} + E_{q,l} = A_{q,l}(E_{q,l}) \tag{8.51}$$

where $A_{q,l}$ is defined by H2″ and where it is easy to show that $E_{q,l}$ is the
$(N^q \times P)$ matrix defined by
$E_{q,l} = [e_{q,l}(\theta_1), \ldots, e_{q,l}(\theta_P)]$, where $e_{q,l}(\theta_i)$
is defined by

$$e_{q,l}(\theta_i) = \sum_{u=0}^{l-1} a_i^{\otimes u} \otimes e_i \otimes a_i^{\otimes(l-u-1)} \otimes a_i^{*\otimes(q-l)} + \sum_{u=0}^{q-l-1} a_i^{\otimes l} \otimes a_i^{*\otimes u} \otimes e_i^{*} \otimes a_i^{*\otimes(q-l-u-1)} \tag{8.52}$$

with the convention $a_i^{\otimes 0} = 1$. In these conditions, the RMS error of the
source DOA estimates can be computed for arbitrary values of q and l from
Equations (8.37) to (8.39) and a first-order Taylor expansion of
$\Pi_{2q,n}(l)(E) \approx \Pi_{2q,n}(l)(E_{q,l}) = I_{N^q} - A_{q,l}(E_{q,l})\,\big(A_{q,l}(E_{q,l})^{\dagger}A_{q,l}(E_{q,l})\big)^{-1}A_{q,l}(E_{q,l})^{\dagger}$
around $E_{q,l} = 0$. This can be done by replacing N with $N^q$, E with $E_{q,l}$,
A with $A_{q,l}$, and $a(\theta_i)$ with $a_{q,l}(\theta_i)$ in (8.41).
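Equation (8.52) states that the HO virtual modeling error is obtained by replacing, one Kronecker factor at a time, the steering vector by the error vector (or its conjugate). The sketch below checks this numerically on arbitrary test vectors: the helper names and the vectors `a_vec` and `e_vec` are made up for the illustration. If the expansion is correct to first order, the residual must shrink quadratically with the error amplitude.

```python
import math
from functools import reduce


def kron(u, v):
    """Kronecker product of two vectors stored as lists."""
    return [x * y for x in u for y in v]


def kron_all(factors):
    return reduce(kron, factors, [1])


def conj(v):
    return [x.conjugate() for x in v]


def extended(a, q, l):
    """a^{(x) l} (x) conj(a)^{(x) (q-l)}."""
    return kron_all([a] * l + [conj(a)] * (q - l))


def virtual_error(a, e, q, l):
    """e_{q,l} of (8.52): one factor at a time is replaced by e (or conj(e))."""
    base = [a] * l + [conj(a)] * (q - l)
    pert = [e] * l + [conj(e)] * (q - l)
    total = [0j] * (len(a) ** q)
    for pos in range(q):
        factors = base[:pos] + [pert[pos]] + base[pos + 1:]
        total = [t + x for t, x in zip(total, kron_all(factors))]
    return total


def residual(a, e, q, l, scale):
    """Norm of a~_{q,l} - a_{q,l} - e_{q,l} for the error vector scale*e."""
    es = [scale * x for x in e]
    a_tilde = [x + y for x, y in zip(a, es)]
    exact = extended(a_tilde, q, l)
    linear = [x + y for x, y in zip(extended(a, q, l), virtual_error(a, es, q, l))]
    return math.sqrt(sum(abs(x - y) ** 2 for x, y in zip(exact, linear)))


a_vec = [1 + 0j, 0.6 + 0.8j, -0.28 + 0.96j]          # arbitrary unit-modulus entries
e_vec = [0.5 - 0.2j, -0.3 + 0.4j, 0.1 + 0.7j]        # arbitrary error direction
r_coarse = residual(a_vec, e_vec, 3, 2, 1e-2)
r_fine = residual(a_vec, e_vec, 3, 2, 1e-3)
print(r_coarse, r_fine, r_coarse / r_fine)
```

Dividing the error amplitude by 10 divides the residual by roughly 100, which is the signature of a neglected second-order term and confirms that (8.52) captures the full first-order contribution.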


8.5.5 Illustrations 

After the description of some hypotheses, the previous results are illustrated in 
this section for both the mono- and bisource cases. 


Hypotheses 

To illustrate the previous results, we assume that P statistically independent
non-Gaussian sources are received by a UCA of N = 5 omnidirectional sensors with
a radius equal to half the wavelength. The DOA of source 1 is $\theta_1 = 100°$. The
modeling error vectors $e_i$ $(1 \le i \le P)$ are assumed to be zero-mean statistically
independent circular Gaussian vectors such that
$\mathrm{E}[e_i e_j^{\dagger}] = \sigma^2 \delta_{ij} I_N$.

Performance for P = 1

We first consider the mono-source case (P = 1). Under the previous assumptions,
Figure 8.7 shows, for q = 1, 2, 3 and for l = 1 (optimal arrangement), the
variations of the RMS error of the source as a function of σ, the RMS value of
the modeling errors. We note similar performance of 2q-MUSIC methods as q
increases, despite the increasing aperture of the HO VA. This can be explained
by the fact that the HO virtual modeling errors (that is, the equivalent modeling
errors associated with the extended steering vector, $a_{q,l}(\theta_1)$) of the source
increase jointly with the HO VA aperture. We note finally a decreasing precision
of 2q-MUSIC methods as σ increases.

FIGURE 8.7 RMSE of the source as a function of σ; q = 1, 2, 3; N = 5; UCA; r/λ = 0.5; P = 1;
i.i.d. Gaussian and circular errors. Theoretical results.

Performance for P = 2

We now consider the case of P = 2 sources, and we assume that σ = 0.12 (strong
errors), which corresponds, for example, to phase and amplitude errors with a
standard deviation of 4.6° and 0.7 dB, respectively. Under these assumptions,
Figure 8.8 shows, for q = 1, 2, 3 and for l = 1 (optimal arrangement), the
variations of the RMS error of source 1 (it is similar for source 2) as a function
of the modulus of the spatial correlation coefficient between the two sources,
$|\alpha_{12}|$, defined by Equation (8.27) with $\theta = \theta_1 = 100°$ and
$\theta_p = \theta_2$. For q = 1, the RMSEs obtained from both a first-order (solid
line) and an SO (dotted line) Taylor expansion of $\Pi_{2,n}(E)$ around E = 0 have
been computed.

Figure 8.8 shows, for q = 1, similar results obtained from both the first-order
and SO Taylor expansion of $\Pi_{2,n}(E)$ around E = 0 as long as the spatial
correlation of the sources is not too high. For high values of the spatial correlation,
results obtained from the first-order expansion of $\Pi_{2,n}(E)$ no longer agree with
the simulations, whereas those obtained from the SO expansion do. The latter
show an increasing RMSE with $|\alpha_{12}|$ as long as the two sources are resolved by
the 2-MUSIC algorithm (in other words, as long as $|\alpha_{12}| < 0.9$) and a
decreasing RMSE with $|\alpha_{12}|$ as soon as the two sources become unresolvable
by the algorithm ($|\alpha_{12}| > 0.9$).

In all cases, the RMSE obtained from the first-order expansion of $\Pi_{2,n}(E)$
is greater than or approximately equal to that obtained from an SO expansion of
$\Pi_{2,n}(E)$. Moreover, for a given value of $|\alpha_{12}|$ such that
$|\alpha_{12}| < 0.95$, Figure 8.8 shows a decreasing value of the RMSE, obtained
from a first-order expansion of $\Pi_{2q,n}(l)(E)$, as q increases. This illustrates
the greater robustness of the 2q-MUSIC algorithm to modeling errors as q increases,
especially for a small angular separation between the sources. For
$|\alpha_{12}| > 0.95$, results obtained for q = 2, 3 from the first-order expansion
of $\Pi_{2q,n}(l)(E)$ are no longer in agreement with the simulations and an SO
expansion of $\Pi_{2q,n}(l)(E)$ has to be taken into account. For a given value of
q > 1, such an expansion would indicate an increasing RMSE with $|\alpha_{12}|$ as
long as the two sources are resolved by the 2q-MUSIC algorithm (that is, as long as
$|\alpha_{12}| < |\alpha_{12}|_q$, where $|\alpha_{12}|_q$ is the resolution threshold
of 2q-MUSIC) and a decreasing RMSE with $|\alpha_{12}|$ as soon as the two sources
become unresolvable by the algorithm ($|\alpha_{12}| > |\alpha_{12}|_q$).

FIGURE 8.8 RMSE of the source 1 as a function of $|\alpha_{12}|$; q = 1, 2, 3; N = 5; UCA; r/λ = 0.5;
P = 2; i.i.d. Gaussian and circular errors, σ = 0.12. Theoretical results.

FIGURE 8.9 RMSE of the source 1 as a function of $|\alpha_{12}|$; q = 1, 2; l = 1, 2; N = 5; UCA; r/λ =
0.5; P = 2; i.i.d. Gaussian and circular errors, σ = 0.12. Theoretical results.

For high angular separation between sources ($|\alpha_{12}| = 0$), the presence of
modeling errors breaks the orthogonality of the sources, which then interact.
This explains why, in this case, 4-MUSIC and 6-MUSIC perform better than
2-MUSIC. All of our previous conclusions would have been similar from an
SO expansion of $\Pi_{2q,n}(l)(E)$, as was verified by simulations and as illustrated
approximately in Chevalier et al. (2006). Such illustrations would have shown
that the maximal value of $|\alpha_{12}|$ ensuring resolvability of the two sources
increases with q, indicating the increasing resolution of 2q-MUSIC as q increases.
These results may be physically interpreted through the HO VA concept introduced in
Chevalier et al. (2005) and the results of Section 8.4.2. They definitely show the
great interest of 2q-MUSIC methods for q > 2, even for overdetermined (P < N)
mixtures of sources, despite the increasing variance and complexity with q.

The influence of the arrangement of the statistics on the robustness to modeling
errors of 2q-MUSIC is illustrated in Figure 8.9, which shows the same
variations as in Figure 8.8 for the same scenario but for q = 1, 2 and for l = 1, 2.
All the results have been obtained from a first-order expansion of $\Pi_{2q,n}(l)(E)$.
We note that performance of 4-MUSIC for l = 1 and l = 2 is similar since
the resolution of the associated VAs is independent of l (indeed
$|\alpha_{12}(q, l)|$ is independent of l).


8.6 COMPUTER SIMULATIONS 

The performance of 2q-MUSIC and 2q-D-MUSIC methods implemented from
a finite number of snapshots is illustrated in this section through computer
simulations. We first introduce performance criteria in Section 8.6.1 and then
describe the simulations in Sections 8.6.2 and 8.6.3 for both overdetermined
(P < N) and underdetermined (P > N) mixtures of sources, respectively. The
sources are assumed to have a zero elevation angle φ and to be zero-mean
stationary, corresponding to QPSK sources sampled at the symbol rate.


8.6.1 Performance Criteria 

For each of the P considered sources and for a given direction-finding method, 
two criteria are used in the following to quantify the performance of the associated 
DOA estimation. For a given source, the first criterion is a probability of aberrant 
results generated by a given method for this source; the second is an averaged 
RMSE computed from the nonaberrant results generated by a given method for 
this source. 

More precisely, for given values of q and l, a given number of snapshots,
L, and a particular realization of the L observation vectors x(k) $(1 \le k \le L)$, the
estimation, $\hat{\theta}_p$, of the DOA of source p $(1 \le p \le P)$ from 2q-MUSIC or
2q-D-MUSIC is defined by

$$\hat{\theta}_p \triangleq \mathrm{Arg}\Big(\min_{\xi_i} |\xi_i - \theta_p|\Big) \tag{8.53}$$

in which the quantities $\xi_i$ $(1 \le i \le P)$ correspond to the P minima of the
pseudo-spectrum $\hat{P}_{2q\text{-Music}(l)}(\theta)$ defined by Equation (8.19) for
φ = 0, or to the P absolute minima of the P pseudo-spectra
$\hat{P}_{2q\text{-D-Music},i(l)}(\theta)$, $1 \le i \le P$, defined by Equation (8.21)
for p = i and φ = 0. With each estimate $\hat{\theta}_p$ $(1 \le p \le P)$, we
associate the corresponding value of the pseudo-spectrum, defined by
$\eta_p = \hat{P}_{2q\text{-Music}(l)}(\hat{\theta}_p)$ or
$\eta_p = \hat{P}_{2q\text{-D-Music},p(l)}(\hat{\theta}_p)$. In this context, the
estimate $\hat{\theta}_p$ is considered to be aberrant, because of an outlier for
example, if $\eta_p > \eta$, where η is a threshold to be defined.

The computation of the optimal threshold requires a deep theoretical analysis
beyond the scope of this chapter. This analysis would show that this threshold is
a function of several parameters such as the number of sensors, the number of
sources, the power of the modeling errors, or the sensor array. In the following
we arbitrarily choose η = 0.1.

We consider M realizations of the L observation vectors x(k) $(1 \le k \le L)$. For a
given method, the probability of aberrant results for a given source p,
$p(\eta_p > \eta)$, is defined by the ratio of the number of realizations for which
$\hat{\theta}_p$ is aberrant and the number of realizations M. From the nonaberrant
realizations for the source p, we then define the averaged RMSE for source p,
$\mathrm{RMSE}_p$, by the quantity

$$\mathrm{RMSE}_p = \sqrt{\frac{1}{M_p} \sum_{m=1}^{M_p} \big|\hat{\theta}_{pm} - \theta_p\big|^2} \tag{8.54}$$

where $M_p$ is the number of nonaberrant realizations for source p and
$\hat{\theta}_{pm}$ is the estimate of $\theta_p$ for the nonaberrant realization m.
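The two criteria can be sketched in a few lines. In the sketch below, the data layout is an assumption made for the illustration: each realization is represented by a list of (candidate DOA, pseudo-spectrum value) pairs, and the function name and toy numbers are hypothetical. The estimate of source p is the candidate closest to the true DOA, as in Equation (8.53), and the RMSE of Equation (8.54) is computed over the nonaberrant realizations only.

```python
import math


def evaluate_source(theta_true, realizations, threshold=0.1):
    """Return (probability of aberrant results, averaged RMSE) for one source.

    realizations: list of realizations, each a list of (xi, eta) pairs giving a
    pseudo-spectrum minimum location xi and the pseudo-spectrum value eta there.
    """
    errors = []
    n_aberrant = 0
    for candidates in realizations:
        # (8.53): keep the candidate closest to the true DOA.
        xi, eta = min(candidates, key=lambda c: abs(c[0] - theta_true))
        if eta > threshold:
            n_aberrant += 1          # aberrant estimate, excluded from the RMSE
        else:
            errors.append(xi - theta_true)
    p_aberrant = n_aberrant / len(realizations)
    if errors:
        rmse = math.sqrt(sum(err * err for err in errors) / len(errors))   # (8.54)
    else:
        rmse = float("nan")          # no nonaberrant realization available
    return p_aberrant, rmse


# Toy data (degrees): four realizations, two candidate minima each.
toy = [
    [(90.5, 0.01), (97.0, 0.02)],
    [(89.5, 0.02), (98.0, 0.03)],
    [(93.0, 0.40), (99.0, 0.50)],    # closest candidate exceeds the threshold
    [(91.0, 0.05), (97.5, 0.01)],
]
print(evaluate_source(90.0, toy))
```

On this toy data one realization out of four is rejected, so the probability of aberrant results is 0.25, and the RMSE is computed from the three remaining errors (0.5°, -0.5°, and 1.0°).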


8.6.2 Overdetermined Mixtures of Sources 


To quantify the influence of both the number of independent snapshots L and the
parameter q on the performance of 2q-MUSIC and 2q-D-MUSIC, we assume that
two statistically independent QPSK sources with a raised cosine pulse-shaped
filter are received by a ULA of N = 3 omnidirectional sensors spaced a
half-wavelength apart. The two QPSK sources have the same symbol duration, the
same roll-off, μ = 0.3, and the same input SNR equal to 5 dB. Note that the
normalized auto-cumulant of the QPSK symbols is equal to -1 at the FO and
+4 at the sixth order.
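The two cumulant values quoted above can be checked exactly from the QPSK alphabet. The sketch below is only an illustration: it assumes unit-power equiprobable symbols (so the normalization by the power is trivial), takes the auto-cumulants with as many conjugated as nonconjugated terms, and uses the standard expression of a joint cumulant as a signed sum of products of moments over all partitions of the variables.

```python
from math import factorial

QPSK = [1, 1j, -1, -1j]     # unit-power alphabet, equiprobable symbols (assumed)


def moment(conj_flags):
    """E[product of s (flag False) or s* (flag True)], averaged over the alphabet."""
    total = 0
    for s in QPSK:
        term = 1
        for c in conj_flags:
            term *= s.conjugate() if c else s
        total += term
    return total / len(QPSK)


def set_partitions(items):
    """Yield every partition of a list into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        yield [[first]] + part
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]


def cumulant(conj_flags):
    """Joint cumulant from moments: sum over partitions of (-1)^(k-1) (k-1)! prod E[.]."""
    total = 0
    for part in set_partitions(list(range(len(conj_flags)))):
        k = len(part)
        term = (-1) ** (k - 1) * factorial(k - 1)
        for block in part:
            term *= moment([conj_flags[i] for i in block])
        total += term
    return total


c4 = cumulant([False, False, True, True])                # Cum[s, s, s*, s*]
c6 = cumulant([False, False, False, True, True, True])   # Cum[s, s, s, s*, s*, s*]
print(c4, c6)
```

The computation gives -1 at the fourth order and +4 at the sixth order, both nonzero, which is what makes QPSK sources usable by 4-MUSIC and 6-MUSIC.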

Under these assumptions, Figures 8.10 and 8.11 show the variations, as a
function of the number of snapshots, L, of the RMSE for source 1, $\mathrm{RMSE}_1$,
and the associated probability of nonaberrant results, $p(\eta_1 \le \eta)$ (we obtain
similar results for source 2), at the output of the 2-MUSIC, 4-MUSIC, and 6-MUSIC
methods for optimal arrangements of the considered statistics, without and with
modeling errors, respectively. These performances are estimated from M = 300
realizations and the minimum value of L for these figures is L = 100. For these
figures, the sources, the DOAs of which are equal to $\theta_1 = 90°$ and
$\theta_2 = 97.5°$, respectively, are poorly angularly separated for the considered
array of sensors, with a 3-dB beamwidth that is around 40°. In the presence of
modeling errors, vectors $e_i$ $(1 \le i \le 2)$ are assumed to be zero-mean
statistically independent circular Gaussian, such that
$\mathrm{E}[e_i e_j^{\dagger}] = \sigma^2 \delta_{ij} I_N$, where σ = 0.03, which
corresponds to phase and amplitude errors with standard deviations equal to 1.15°
and 0.18 dB, respectively.

FIGURE 8.10 (a) RMSE of the source 1 and (b) $p(\eta_1 \le \eta)$ as a function of L. l = 1, P = 2, N = 3,
ULA, SNR = 5 dB, $\theta_1 = 90°$, $\theta_2 = 97.5°$, no modeling errors.

FIGURE 8.11 (a) RMSE of the source 1 and (b) $p(\eta_1 \le \eta)$ as a function of L. l = 1, P = 2, N = 3,
ULA, SNR = 5 dB, $\theta_1 = 90°$, $\theta_2 = 97.5°$, with modeling errors, σ = 0.03.

In terms of both probability of nonaberrant results and estimation precision, 
Figures 8.10 and 8.11 show, for poorly angularly separated sources, the best
behavior of the 6-MUSIC method with respect to 2-MUSIC and 4-MUSIC as 
soon as L becomes greater than 660 snapshots without modeling errors and 500 
snapshots with them. For such values of L, the resolution gain and the better 
robustness to modeling errors obtained with 6-MUSIC versus 2-MUSIC and 
4-MUSIC, respectively, due to the narrower 3-dB beamwidth of the associated 
sixth-order VA, are higher than the loss due to a greater variance in the statistics 
estimates. A similar analysis can be carried out for 4-MUSIC versus 2-MUSIC 
as soon as L becomes greater than 3500 snapshots without modeling errors and 
2000 snapshots with them. 

These results confirm that, despite their greater variance and contrary to some 
generally accepted ideas, HO MUSIC methods may offer better performance 
than 2-MUSIC or 4-MUSIC methods when some resolution is required—that 
is, in the presence of several poorly angularly separated sources with or without 
modeling errors, inherent in operational contexts. 

To complete these results, Figures 8.12 and 8.13 show the same variations
as in Figure 8.10 and Figure 8.11, respectively, but for sources that are well
angularly separated and the DOAs of which are equal to $\theta_1 = 90°$ and
$\theta_2 = 130°$, respectively. Note in this case the best behavior of SO versus HO
methods, due to a higher variance in the HO statistics estimates, when no particular
resolution is required (i.e., in the absence of modeling errors; see Figure 8.12).
This result would also be valid in the presence of very weak modeling errors; however,
we note a better performance of HO methods as soon as the sources interact
with each other, meaning that some resolution is required. This is the case in the
presence of higher modeling errors (Figure 8.13), provided the variance in the
HO statistics estimate becomes small (L > 1600).

To illustrate the significance of the sequential algorithms, we consider again
the scenario of Figure 8.10 but now $\theta_1 = 90^\circ$ and $\theta_2 = 95^\circ$. Under these
assumptions, Figure 8.14 shows the same variations as in Figure 8.10 but at the
output of 2-MUSIC, 2-D-MUSIC, 4-MUSIC, 4-D-MUSIC, 6-MUSIC, and 6-D-MUSIC.
Note, for a given value of $q$, the better behavior of the sequential algorithms
(2q-D-MUSIC) with respect to the parallel (2q-MUSIC) methods when a strong
resolution between sources is required.

FIGURE 8.12 (a) RMSE of the source 1 and (b) $p(\eta_1 \le \eta)$ as a function of $L$.
$l = 1$, $P = 2$, $N = 3$, ULA, SNR = 5 dB, $\theta_1 = 90^\circ$, $\theta_2 = 130^\circ$, no modeling errors.

FIGURE 8.13 (a) RMSE of the source 1 and (b) $p(\eta_1 \le \eta)$ as a function of $L$.
$l = 1$, $P = 2$, $N = 3$, ULA, SNR = 5 dB, $\theta_1 = 90^\circ$, $\theta_2 = 130^\circ$, with modeling
errors, $\sigma = 0.03$.

FIGURE 8.14 (a) RMSE of the source 1 and (b) $p(\eta_1 \le \eta)$ as a function of $L$.
$l = 1$, $P = 2$, $N = 3$, ULA, SNR = 5 dB, $\theta_1 = 90^\circ$, $\theta_2 = 95^\circ$, no modeling errors.

The influence of the HO statistics arrangement on the performance of
2q-MUSIC methods was illustrated in Chevalier et al. (2006), which still showed
increasing performance with $q$, whatever the choice of $l$, when some resolution
is required.


8.6.3 Underdetermined Mixtures of Sources 

To illustrate the influence of the arrangement of the HO statistics on the number
of sources that can be processed by the 2q-MUSIC algorithm, we assume that six
statistically independent QPSK sources with a raised cosine pulse-shaped filter
are received by a UCA of $N = 3$ omnidirectional sensors with a radius $r$ such that
$r = 0.3\lambda$. The six QPSK sources have the same symbol duration, the same
roll-off, $\mu = 0.3$, the same input SNR equal to 15 dB, and a DOA equal to
$\theta_1 = 35^\circ$, $\theta_2 = 80^\circ$, $\theta_3 = 110^\circ$, $\theta_4 = 140^\circ$, $\theta_5 = 230^\circ$, and $\theta_6 = 304^\circ$,
respectively. Under these assumptions, Figure 8.15 shows the variations in the
pseudo-spectrum of 4-MUSIC, for $l = 1$ and 2, as a function of $\theta$ for $L = 10{,}000$
without modeling errors. Note the good behavior of 4-MUSIC for $l = 1$, which
succeeds in estimating the DOA of the six sources, and the poor behavior of
4-MUSIC for $l = 2$, which fails. The reason is that the number of VSs of the
associated VA is equal to $N_4^1 = 7$ for $l = 1$ and equal to $N_4^2 = 6$ for $l = 2$, so that
at most $N_4^l - 1$ sources, that is, six and five sources, respectively, can be
processed.

To complete these results, Figure 8.16 shows the variations, as a function of
the number of snapshots $L$, of the highest RMSE and the lowest probability of
nonaberrant results among all sources, estimated from $M = 300$ realizations, at
the output of the 4-MUSIC algorithm for $l = 1$ and without modeling errors. Note
the increasing minimum probability of nonaberrant results and the decreasing
maximal RMSE as $L$ increases, showing the capability of 4-MUSIC to efficiently
estimate the DOA of all sources for $l = 1$.

FIGURE 8.15 Pseudo-spectrum of 4-MUSIC as a function of $\theta$: (a) $l = 1$; (b) $l = 2$.
$P = 6$, $N = 3$, UCA, SNR = 15 dB, $L = 10{,}000$, $\theta_1 = 35^\circ$, $\theta_2 = 80^\circ$, $\theta_3 = 110^\circ$,
$\theta_4 = 140^\circ$, $\theta_5 = 230^\circ$, $\theta_6 = 304^\circ$, no modeling errors. (Source: From Chevalier
et al., 2006 © IEEE, with permission.)

FIGURE 8.16 (a) Maximal RMSE and (b) minimum probability of nonaberrant results as a
function of $L$: $l = 1$, $P = 6$, $N = 3$, UCA, SNR = 15 dB, $\theta_1 = 35^\circ$, $\theta_2 = 80^\circ$,
$\theta_3 = 110^\circ$, $\theta_4 = 140^\circ$, $\theta_5 = 230^\circ$, $\theta_6 = 304^\circ$, no modeling errors. (Source: From
Chevalier et al., 2006 © IEEE, with permission.)

Other illustrations of 2q-MUSIC for underdetermined mixtures of sources
may be found in Chevalier et al. (2006), particularly for $q = 3$.


8.7 EXTENSION TO ARRAYS WITH DIVERSELY 
POLARIZED ANTENNAS: THE PD-2q-MUSIC 
METHODS 


This section presents extensions of the 2q-MUSIC algorithm to arrays with
diversely polarized antennas. These extensions, introduced in Chevalier et al.
(2007), give rise to new algorithms called PD-2q-MUSIC. Some hypotheses
are described in Section 8.7.1 and PD-2q-MUSIC algorithms are presented
in Section 8.7.2. The identifiability of these algorithms is discussed in
Section 8.7.3. Finally, the performance of PD-2q-MUSIC algorithms is illustrated
in Section 8.7.4.


8.7.1 Hypotheses and Notations 

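The observation model is still assumed to be given by Equation (8.2) where, for
an array with diversely polarized sensors and in the absence of coupling between
them, the complex response, $f_n(\boldsymbol{\theta}_i, \boldsymbol{\beta}_i)$, of sensor $n$ for DOA $\boldsymbol{\theta}_i$ and
polarization $\boldsymbol{\beta}_i$, introduced in Equation (8.3), is potentially a function of $n$.
In these conditions, the steering vector $a(\boldsymbol{\theta}_i, \boldsymbol{\beta}_i)$ actually depends on the
parameters $\boldsymbol{\theta}_i$ and $\boldsymbol{\beta}_i$.

Let $\boldsymbol{\beta}_{i1}$ and $\boldsymbol{\beta}_{i2}$ be two distinct polarizations for the $i$th source (for
example, vertical and horizontal) and $a_1(\boldsymbol{\theta}_i) = a(\boldsymbol{\theta}_i, \boldsymbol{\beta}_{i1})$ and
$a_2(\boldsymbol{\theta}_i) = a(\boldsymbol{\theta}_i, \boldsymbol{\beta}_{i2})$ be the corresponding steering vectors for DOA $\boldsymbol{\theta}_i$.
We assume that the vectors $a_1(\boldsymbol{\theta})$ and $a_2(\boldsymbol{\theta})$ can be calculated analytically
or measured by calibration whatever the value of $\boldsymbol{\theta}$. Considering an arbitrary
polarization $\boldsymbol{\beta}_i$ for the $i$th source, the complex electric field of the latter
can be broken down into the sum of two complex fields, each arriving from the
same direction and having the polarizations $\boldsymbol{\beta}_{i1}$ and $\boldsymbol{\beta}_{i2}$ (Compton, 1988).
The steering vector $a(\boldsymbol{\theta}_i, \boldsymbol{\beta}_i)$ of source $i$ is then the weighted sum of the
steering vectors $a_1(\boldsymbol{\theta}_i)$ and $a_2(\boldsymbol{\theta}_i)$ given by

$$a(\boldsymbol{\theta}_i, \boldsymbol{\beta}_i) = \beta_{i1}\, a_1(\boldsymbol{\theta}_i) + \beta_{i2}\, a_2(\boldsymbol{\theta}_i) = A_{12}(\boldsymbol{\theta}_i)\, \boldsymbol{\beta}_i \tag{8.55}$$

In Equation (8.55), $A_{12}(\boldsymbol{\theta}_i)$ is the $(N \times 2)$ matrix of vectors $a_1(\boldsymbol{\theta}_i)$ and
$a_2(\boldsymbol{\theta}_i)$, whereas the scalars $\beta_{i1}$ and $\beta_{i2}$ are complex numbers such that
$|\beta_{i1}|^2 + |\beta_{i2}|^2 = 1$. Vector $\boldsymbol{\beta}_i$ is the unit norm $(2 \times 1)$ vector with
components $\beta_{i1}$ and $\beta_{i2}$. It can be written, to within a phase term, as
$\boldsymbol{\beta}_i = [\cos\gamma_i,\ e^{j\phi_i} \sin\gamma_i]^T$, where $\gamma_i$ and $\phi_i$ are two angles
characterizing the polarization of source $i$ and such that
$(0 \le \gamma_i \le \pi/2,\ -\pi \le \phi_i < \pi)$. Note that for an array with identical sensors,
$a_1(\boldsymbol{\theta}_i)$, $a_2(\boldsymbol{\theta}_i)$, and $a(\boldsymbol{\theta}_i, \boldsymbol{\beta}_i)$ are collinear, which means that, to
within a constant, $a(\boldsymbol{\theta}_i, \boldsymbol{\beta}_i)$ does not depend on the polarization of source $i$.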


8.7.2 The PD-2q-MUSIC Algorithms 

Two kinds of algorithms are defined in the following subsections, depending on 
whether the polarization of the sources is a priori known or unknown. 


Sources with Known Polarization: KP-PD-2q-MUSIC 

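Assuming that the hypotheses of Section 8.3.1 are verified, and following the
same reasoning as in Section 8.3, we deduce that the DOAs of the sources are
solutions to Equation (8.18), which can be written, for an array with diversely
polarized sensors, as

$$a_{q,l}(\boldsymbol{\theta}, \boldsymbol{\beta})^{\dagger}\, \Pi_{2q,n}(l)\, a_{q,l}(\boldsymbol{\theta}, \boldsymbol{\beta}) = 0 \tag{8.56}$$

where $a_{q,l}(\boldsymbol{\theta}, \boldsymbol{\beta})$ is defined by Equation (8.11) with $a(\boldsymbol{\theta}, \boldsymbol{\beta})$ instead of
$a(\boldsymbol{\theta})$. Equation (8.56) corresponds to the heart of the PD-2q-MUSIC algorithms
for the arrangement $l$ and can also be written, using Equations (8.55) and (8.11), as

$$\boldsymbol{\beta}_{q,l}^{\dagger}\, A_{12,q,l}(\boldsymbol{\theta})^{\dagger}\, \Pi_{2q,n}(l)\, A_{12,q,l}(\boldsymbol{\theta})\, \boldsymbol{\beta}_{q,l} = 0 \tag{8.57}$$

where $\boldsymbol{\beta}_{q,l}$ and $A_{12,q,l}(\boldsymbol{\theta})$ are, respectively, the $(2^q \times 1)$ vector and the
$(N^q \times 2^q)$ matrix defined by

$$\boldsymbol{\beta}_{q,l} \triangleq \left[\boldsymbol{\beta}^{\otimes l} \otimes \boldsymbol{\beta}^{*\otimes (q-l)}\right] \tag{8.58}$$

$$A_{12,q,l}(\boldsymbol{\theta}) \triangleq \left[A_{12}(\boldsymbol{\theta})^{\otimes l} \otimes A_{12}(\boldsymbol{\theta})^{*\otimes (q-l)}\right] \tag{8.59}$$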
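The step from Equation (8.56) to Equation (8.57) rests on the mixed-product
property of the Kronecker product. The following Python sketch checks it
numerically; the random matrix standing in for $A_{12}(\boldsymbol{\theta})$, the helper name
`kron_power`, and the chosen values of $N$, $q$, and $l$ are illustrative assumptions,
not quantities taken from the text.

```python
import numpy as np
from functools import reduce

def kron_power(x, l, m):
    """Return x^(kron l) kron conj(x)^(kron m) for a 1-D vector or 2-D matrix x."""
    factors = [x] * l + [np.conj(x)] * m
    return reduce(np.kron, factors)

rng = np.random.default_rng(0)
N, q, l = 3, 3, 2                       # illustrative sizes (assumption)

# Hypothetical (N x 2) matrix standing in for A12(theta) at one fixed DOA.
A12 = rng.standard_normal((N, 2)) + 1j * rng.standard_normal((N, 2))

# Unit-norm polarization vector beta = [cos(gamma), exp(j*phi) sin(gamma)]^T.
gamma, phi = 0.7, -1.1
beta = np.array([np.cos(gamma), np.exp(1j * phi) * np.sin(gamma)])

a = A12 @ beta                          # Equation (8.55)
a_ql = kron_power(a, l, q - l)          # Equation (8.11) applied to a(theta, beta)
A_ql = kron_power(A12, l, q - l)        # Equation (8.59), size N^q x 2^q
beta_ql = kron_power(beta, l, q - l)    # Equation (8.58), size 2^q

# a_{q,l}(theta, beta) = A_{12,q,l}(theta) beta_{q,l}, which turns (8.56) into (8.57).
print(A_ql.shape, np.allclose(a_ql, A_ql @ beta_ql))
```

The check holds for any $(q, l)$ with $1 \le l \le q$, since each Kronecker factor of
$a_{q,l}$ is either $A_{12}\boldsymbol{\beta}$ or its conjugate.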


For some values of $(q, l)$, some components of $\boldsymbol{\beta}_{q,l}$ are equal. It is then
useful, both to reduce the complexity of computing the left side of
Equation (8.57) and to improve the performance of the algorithms presented in
the next section, to remove the redundant components of $\boldsymbol{\beta}_{q,l}$. This can be
done by removing the redundant components of both $\boldsymbol{\beta}^{\otimes l}$ and
$\boldsymbol{\beta}^{*\otimes (q-l)}$. It is straightforward to show that $\boldsymbol{\beta}^{\otimes l}$ can be written as

$$\boldsymbol{\beta}^{\otimes l} = B_l\, \tilde{\boldsymbol{\beta}}_l \tag{8.60}$$

where $B_l$ is the $(2^l \times (l+1))$ real matrix such that

$$B_1 = I_2 \tag{8.61a}$$

$$B_{l+1} = [B_l \otimes I_2]\,\big\{[I_{l+1} \otimes c_1][I_{l+1}\ \ 0_{l+1}] + [I_{l+1} \otimes c_2][0_{l+1}\ \ I_{l+1}]\big\}, \quad (l \ge 1) \tag{8.61b}$$

In both parts of Equation (8.61), $I_r$ is the $(r \times r)$ identity matrix; $c_1$ and $c_2$
are the $(2 \times 1)$ vectors defined by $c_1 = [1\ 0]^T$ and $c_2 = [0\ 1]^T$; $0_{l+1}$ is the
$((l+1) \times 1)$ zero vector; and $\tilde{\boldsymbol{\beta}}_l$ is the $((l+1) \times 1)$ vector with components
$\tilde{\boldsymbol{\beta}}_l[j]$ defined by

$$\tilde{\boldsymbol{\beta}}_l[j] = \beta_1^{\,l-j+1}\, \beta_2^{\,j-1}, \quad 1 \le j \le l+1 \tag{8.62}$$
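The recursion of Equation (8.61) is easy to get wrong by one index, so a short
numerical check may help. The sketch below builds $B_l$ from Equations (8.61a)
and (8.61b) and the nonredundant vector of Equation (8.62), then compares
$B_l \tilde{\boldsymbol{\beta}}_l$ with the full Kronecker power of Equation (8.60). The function
names `build_B` and `beta_tilde` and the test polarization are illustrative
assumptions.

```python
import numpy as np
from functools import reduce

def build_B(l):
    """Matrix B_l of Equation (8.61), of size 2^l x (l + 1)."""
    c1 = np.array([[1.0], [0.0]])
    c2 = np.array([[0.0], [1.0]])
    B = np.eye(2)                                  # B_1 = I_2, Equation (8.61a)
    for k in range(1, l):                          # builds B_{k+1} from B_k
        eye = np.eye(k + 1)
        zero = np.zeros((k + 1, 1))
        first = np.kron(eye, c1) @ np.hstack([eye, zero])
        second = np.kron(eye, c2) @ np.hstack([zero, eye])
        B = np.kron(B, np.eye(2)) @ (first + second)   # Equation (8.61b)
    return B

def beta_tilde(beta, l):
    """Vector of Equation (8.62): components beta_1^(l-j+1) * beta_2^(j-1)."""
    j = np.arange(1, l + 2)
    return beta[0] ** (l - j + 1) * beta[1] ** (j - 1)

# Arbitrary unit-norm polarization vector (assumption for the check).
beta = np.array([np.cos(0.4), np.exp(0.9j) * np.sin(0.4)])

for l in range(1, 5):
    full = reduce(np.kron, [beta] * l)             # beta^(kron l), length 2^l
    reduced = build_B(l) @ beta_tilde(beta, l)     # Equation (8.60)
    print(l, build_B(l).shape, np.allclose(full, reduced))
```

For $l = 2$, for instance, the four components of $\boldsymbol{\beta} \otimes \boldsymbol{\beta}$ contain
$\beta_1 \beta_2$ twice, and $B_2$ simply copies the middle entry of $\tilde{\boldsymbol{\beta}}_2$ into both
positions.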


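where $\beta_1$ and $\beta_2$ are the components of the polarization vector $\boldsymbol{\beta}$. From
Equations (8.58) and (8.60), we deduce that

$$\boldsymbol{\beta}_{q,l} = (B_l \tilde{\boldsymbol{\beta}}_l) \otimes (B_{q-l} \tilde{\boldsymbol{\beta}}_{q-l})^{*} = [B_l \otimes B_{q-l}]\,[\tilde{\boldsymbol{\beta}}_l \otimes \tilde{\boldsymbol{\beta}}_{q-l}^{*}] = [B_l \otimes B_{q-l}]\, \tilde{\boldsymbol{\beta}}_{q,l} \tag{8.63}$$

where $\tilde{\boldsymbol{\beta}}_{q,l} \triangleq [\tilde{\boldsymbol{\beta}}_l \otimes \tilde{\boldsymbol{\beta}}_{q-l}^{*}]$ is an $((l+1)(q-l+1) \times 1)$ vector.

Note a dimension reduction of $\tilde{\boldsymbol{\beta}}_{q,l}$ with respect to $\boldsymbol{\beta}_{q,l}$ for most values of
$(q, l)$. To ensure, in the absence of sources (i.e., when $\Pi_{2q,n}(l) = I_{N^q}$), a
constant value independent of parameters $\boldsymbol{\theta}$ and $\boldsymbol{\beta}$ of the left side of
Equation (8.57), it is necessary to normalize the latter by the quantity
$\boldsymbol{\beta}_{q,l}^{\dagger} A_{12,q,l}(\boldsymbol{\theta})^{\dagger} A_{12,q,l}(\boldsymbol{\theta}) \boldsymbol{\beta}_{q,l}$.

Using Equation (8.63) in Equation (8.57), the problem of source DOA estimation
by the PD-2q-MUSIC algorithm for the arrangement $l$ then consists of finding the
$P$ sets of parameters $(\hat{\boldsymbol{\theta}}_i, \hat{\boldsymbol{\beta}}_i) = (\hat{\theta}_i, \hat{\varphi}_i, \hat{\gamma}_i, \hat{\phi}_i)$, $(1 \le i \le P)$, which
either are solutions to Equation (8.64) or minimize its left side, where
Equation (8.64) is defined by

$$\frac{\tilde{\boldsymbol{\beta}}_{q,l}^{\dagger}\, Q_{q,l,1}(\boldsymbol{\theta})\, \tilde{\boldsymbol{\beta}}_{q,l}}{\tilde{\boldsymbol{\beta}}_{q,l}^{\dagger}\, Q_{q,l,2}(\boldsymbol{\theta})\, \tilde{\boldsymbol{\beta}}_{q,l}} = 0 \tag{8.64}$$

where the $((l+1)(q-l+1) \times (l+1)(q-l+1))$ matrices $Q_{q,l,1}(\boldsymbol{\theta})$ and $Q_{q,l,2}(\boldsymbol{\theta})$
are defined by

$$Q_{q,l,1}(\boldsymbol{\theta}) \triangleq [B_l \otimes B_{q-l}]^{\dagger}\, A_{12,q,l}(\boldsymbol{\theta})^{\dagger}\, \Pi_{2q,n}(l)\, A_{12,q,l}(\boldsymbol{\theta})\, [B_l \otimes B_{q-l}] \tag{8.65}$$

$$Q_{q,l,2}(\boldsymbol{\theta}) \triangleq [B_l \otimes B_{q-l}]^{\dagger}\, A_{12,q,l}(\boldsymbol{\theta})^{\dagger}\, A_{12,q,l}(\boldsymbol{\theta})\, [B_l \otimes B_{q-l}] \tag{8.66}$$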


For sources with known polarization, the set of parameters for a given source
reduces to the set of its DOA parameters, and the complexity of PD-2q-MUSIC,
called in this case KP-PD-2q-MUSIC (Known Polarization PD-2q-MUSIC),
corresponds to that of 2q-MUSIC. However, for sources with unknown
polarization, the set of parameters for a given source has to take into account
polarization parameters in addition to DOA parameters, causing the complexity of
the searching procedure of PD-2q-MUSIC to increase dramatically beyond what
is generally and practically reasonable. For this reason, Chevalier et al. (2007)
proposed using the KP-PD-2q-MUSIC algorithm only when the polarization of
sources is known.

In practical situations, matrices $\Pi_{2q,n}(l)$ and $U_{2q,n}(l)$ must be estimated
from the observations; assuming sources with known polarization, the sources'
DOA may be found by searching for the minima of the estimated left side of
Equation (8.64). The different steps of the KP-PD-2q-MUSIC algorithm for the
arrangement $l$ are summarized as follows:

Step 1. Estimation, $\hat{C}_{2q,x}(l)$, of the matrix $C_{2q,x}(l)$ from $L$ snapshots $x(k)$,
$1 \le k \le L$, using a suitable estimator of the 2qth-order cumulants of observations.

Step 2. Eigendecomposition of the matrix $\hat{C}_{2q,x}(l)$ and extraction of an
estimate, $\hat{U}_{2q,n}(l)$, of the $U_{2q,n}(l)$ matrix. This step may involve rank
determination in cases where the number of sources and/or their mutual
statistical dependence are not known a priori.

Step 3. Computation, for each known vector $\boldsymbol{\beta} = \boldsymbol{\beta}_i$ $(1 \le i \le P)$, of the
estimated pseudo-spectrum,

$$\hat{P}_{\text{KP-PD-2q-MUSIC}(l)}(\boldsymbol{\theta}, \boldsymbol{\beta}) \triangleq \frac{\tilde{\boldsymbol{\beta}}_{q,l}^{\dagger}\, \hat{Q}_{q,l,1}(\boldsymbol{\theta})\, \tilde{\boldsymbol{\beta}}_{q,l}}{\tilde{\boldsymbol{\beta}}_{q,l}^{\dagger}\, Q_{q,l,2}(\boldsymbol{\theta})\, \tilde{\boldsymbol{\beta}}_{q,l}} \tag{8.67}$$

over a suitably chosen grid. Then search for the local minima (including
interpolation at each local minimum), where the
$((l+1)(q-l+1) \times (l+1)(q-l+1))$ matrix $\hat{Q}_{q,l,1}(\boldsymbol{\theta})$ is defined by

$$\hat{Q}_{q,l,1}(\boldsymbol{\theta}) \triangleq [B_l \otimes B_{q-l}]^{\dagger}\, A_{12,q,l}(\boldsymbol{\theta})^{\dagger}\, \hat{\Pi}_{2q,n}(l)\, A_{12,q,l}(\boldsymbol{\theta})\, [B_l \otimes B_{q-l}] \tag{8.68}$$

In Equation (8.68), $\hat{\Pi}_{2q,n}(l) \triangleq \hat{U}_{2q,n}(l)\, \hat{U}_{2q,n}(l)^{\dagger}$. Note that, similar to
2q-MUSIC, PD-2q-MUSIC cannot handle perfectly coherent sources.
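The three steps can be sketched in Python for the simplest case $q = l = 1$, where
$\tilde{\boldsymbol{\beta}}_{1,1} = \boldsymbol{\beta}$ and the matrices of Equations (8.66) and (8.68) reduce to
$A_{12}^{\dagger} A_{12}$ and $A_{12}^{\dagger} \hat{\Pi}_{2,n} A_{12}$. This is only a sketch under stated
assumptions: the array model in `response_matrix` (a half-wavelength ULA with
alternating polarization sensitivity) is hypothetical, and the exact covariance
matrix of two uncorrelated unit-power sources replaces the estimate of Step 1.

```python
import numpy as np

def response_matrix(theta, n_sensors=6):
    """Hypothetical (N x 2) matrix A12(theta): half-wavelength ULA whose
    even-indexed sensors sense only polarization component 1 and whose
    odd-indexed sensors sense only component 2 (illustrative model only)."""
    n = np.arange(n_sensors)
    phase = np.exp(1j * np.pi * n * np.cos(theta))
    pattern = np.stack([n % 2 == 0, n % 2 == 1], axis=1).astype(float)
    return phase[:, None] * pattern

def polarization_vector(gamma, phi):
    """beta = [cos(gamma), exp(j*phi) sin(gamma)]^T."""
    return np.array([np.cos(gamma), np.exp(1j * phi) * np.sin(gamma)])

def noise_projector(cov, n_sources):
    """Step 2: projector U_n U_n^H from the eigenvectors associated with the
    smallest eigenvalues of the statistics matrix."""
    _, vecs = np.linalg.eigh(cov)                 # eigenvalues in ascending order
    u_n = vecs[:, : cov.shape[0] - n_sources]
    return u_n @ u_n.conj().T

def kp_pseudo_spectrum(theta, beta, proj):
    """Step 3: Equation (8.67) specialized to q = l = 1."""
    a12 = response_matrix(theta, proj.shape[0])
    q1 = a12.conj().T @ proj @ a12
    q2 = a12.conj().T @ a12
    return np.real(beta.conj() @ q1 @ beta) / np.real(beta.conj() @ q2 @ beta)

# Two sources with known polarizations (values chosen for illustration).
doas = np.deg2rad([50.0, 60.0])
betas = [polarization_vector(np.pi / 4, 0.0),
         polarization_vector(np.pi / 4, np.deg2rad(10.0))]
steering = np.column_stack([response_matrix(t) @ b for t, b in zip(doas, betas)])
cov = steering @ steering.conj().T + 0.1 * np.eye(6)   # exact covariance (assumption)
proj = noise_projector(cov, n_sources=2)

# One pseudo-spectrum per known polarization, minimized over a 0.1 degree grid.
grid = np.deg2rad(np.arange(20.0, 160.0, 0.1))
doa_hat = [np.rad2deg(grid[np.argmin([kp_pseudo_spectrum(t, b, proj) for t in grid])])
           for b in betas]
print(doa_hat)
```

With estimated statistics the minima are no longer exact zeros, which is why
Step 3 asks for a local-minimum search with interpolation rather than a root
search.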


Sources with Unknown Polarization: UP-PD-2q-MUSIC 

For sources with unknown polarization, the complexity of the PD-2q-MUSIC
searching procedure, described in the previous section, with respect to DOA
and polarization parameters is prohibitively high. A simple way to remove the
searching procedure for the polarization parameter consists, for any fixed DOA,
of minimizing the left side of Equation (8.64) with respect to this parameter, as
was proposed in Ferrara and Parks (1983) for $q = 1$. This gives rise to
UP-PD-2q-MUSIC (Unknown Polarization PD-2q-MUSIC), of which the
pseudo-spectrum, for the arrangement indexed by $l$, is given by

$$P_{\text{UP-PD-2q-MUSIC}(l)}(\boldsymbol{\theta}) \triangleq \min_{\tilde{\boldsymbol{\beta}}_{q,l}} \left\{ \frac{\tilde{\boldsymbol{\beta}}_{q,l}^{\dagger}\, Q_{q,l,1}(\boldsymbol{\theta})\, \tilde{\boldsymbol{\beta}}_{q,l}}{\tilde{\boldsymbol{\beta}}_{q,l}^{\dagger}\, Q_{q,l,2}(\boldsymbol{\theta})\, \tilde{\boldsymbol{\beta}}_{q,l}} \right\} \tag{8.69}$$

It is well known (Ferrara and Parks, 1983) that the right side of Equation (8.69)
corresponds to the minimum eigenvalue, $\lambda_{q,l,\min}(\boldsymbol{\theta})$, of the
$((l+1)(q-l+1) \times (l+1)(q-l+1))$ matrix $Q_{q,l,1}(\boldsymbol{\theta})$ in the metric $Q_{q,l,2}(\boldsymbol{\theta})$, and
that the minimizing vector, denoted $\tilde{\boldsymbol{\beta}}_{q,l,\min}(\boldsymbol{\theta})$, corresponds to the associated
eigenvector. In other words, $\lambda_{q,l,\min}(\boldsymbol{\theta})$ and $\tilde{\boldsymbol{\beta}}_{q,l,\min}(\boldsymbol{\theta})$ satisfy

$$Q_{q,l,1}(\boldsymbol{\theta})\, \tilde{\boldsymbol{\beta}}_{q,l,\min}(\boldsymbol{\theta}) = \lambda_{q,l,\min}(\boldsymbol{\theta})\, Q_{q,l,2}(\boldsymbol{\theta})\, \tilde{\boldsymbol{\beta}}_{q,l,\min}(\boldsymbol{\theta}) \tag{8.70}$$

Thus, a first version of UP-PD-2q-MUSIC for the arrangement indexed by $l$,
called UP-PD-2q-MUSIC($l$)-1, consists of finding the $P$ sets of parameters
$\hat{\boldsymbol{\theta}}_i = (\hat{\theta}_i, \hat{\varphi}_i)$, $(1 \le i \le P)$, for which the pseudo-spectrum

$$P_{\text{UP-PD-2q-MUSIC}(l)\text{-}1}(\boldsymbol{\theta}) \triangleq \lambda_{q,l,\min}(\boldsymbol{\theta}) \tag{8.71}$$

is zero.

This algorithm corresponds to a 2qth-order extension, for the arrangement
indexed by $l$, of the algorithm proposed in Ferrara and Parks (1983) for $q = 1$. It
was shown in Chevalier et al. (2007) that it thus becomes possible to estimate the
polarization of each source $i$ from the associated eigenvector $\tilde{\boldsymbol{\beta}}_{q,l,\min}(\boldsymbol{\theta}_i)$,
which is the solution to Equation (8.70) for $\boldsymbol{\theta} = \boldsymbol{\theta}_i$. Note that one way in which
the eigenvalue $\lambda_{q,l,\min}(\boldsymbol{\theta})$ can be computed is by determining the minimum root of

$$\det[Q_{q,l,1}(\boldsymbol{\theta}) - \lambda\, Q_{q,l,2}(\boldsymbol{\theta})] = 0 \tag{8.72}$$

where $\det[X]$ denotes the determinant of $X$. Thus, for each value of $\boldsymbol{\theta}$, searching
in polarization space has been avoided by finding the roots of an equation
of order $(l+1)(q-l+1)$, which corresponds to a substantial reduction in
computation, at least for small values of $q$. We deduce from Equation (8.72) and
from Chevalier et al. (2007) and Ferreol et al. (2004a) that for invertible matrix
$Q_{q,l,2}(\boldsymbol{\theta})$, finding $\boldsymbol{\theta}$ such that $\lambda = \lambda_{q,l,\min}(\boldsymbol{\theta})$ is zero is equivalent to
finding $\boldsymbol{\theta}$ such that
$\det[Q_{q,l,2}(\boldsymbol{\theta})^{-1} Q_{q,l,1}(\boldsymbol{\theta})] = \det[Q_{q,l,1}(\boldsymbol{\theta})] / \det[Q_{q,l,2}(\boldsymbol{\theta})] = 0$.

A second version of the UP-PD-2q-MUSIC algorithm for the arrangement
indexed by $l$, called UP-PD-2q-MUSIC($l$)-2, consists of finding the $P$ sets of
parameters $\hat{\boldsymbol{\theta}}_i = (\hat{\theta}_i, \hat{\varphi}_i)$, $(1 \le i \le P)$, for which the pseudo-spectrum

$$P_{\text{UP-PD-2q-MUSIC}(l)\text{-}2}(\boldsymbol{\theta}) \triangleq \frac{\det[Q_{q,l,1}(\boldsymbol{\theta})]}{\det[Q_{q,l,2}(\boldsymbol{\theta})]} \tag{8.73}$$

is zero, which allows a decrease in complexity with respect to the computation
of Equation (8.71).

In practical situations, matrix $U_{2q,n}(l)$ has to be estimated from
observations. Assuming sources with unknown polarization, their DOA may be found
by searching for the minima of the right side of Equation (8.71) or (8.73). The
steps of the two versions of the UP-PD-2q-MUSIC algorithm for the arrangement
$l$ are summarized as follows:

Step 1. Estimation, $\hat{C}_{2q,x}(l)$, of the matrix $C_{2q,x}(l)$ from $L$ snapshots $x(k)$,
$1 \le k \le L$, using a suitable estimator of the 2qth-order cumulants of observations.

Step 2. Eigendecomposition of the matrix $\hat{C}_{2q,x}(l)$ and extraction of an
estimate, $\hat{U}_{2q,n}(l)$, of the $U_{2q,n}(l)$ matrix. This step may involve rank
determination in cases where the number of sources and/or their mutual
statistical dependence are not known a priori.

Step 3. Computation of matrices $\hat{Q}_{q,l,1}(\boldsymbol{\theta})$ and $Q_{q,l,2}(\boldsymbol{\theta})$ and one of the two
estimated pseudo-spectrums,

$$\hat{P}_{\text{UP-PD-2q-MUSIC}(l)\text{-}1}(\boldsymbol{\theta}) \triangleq \hat{\lambda}_{q,l,\min}(\boldsymbol{\theta}), \qquad \hat{P}_{\text{UP-PD-2q-MUSIC}(l)\text{-}2}(\boldsymbol{\theta}) \triangleq \frac{\det[\hat{Q}_{q,l,1}(\boldsymbol{\theta})]}{\det[Q_{q,l,2}(\boldsymbol{\theta})]}$$

over a suitably chosen grid. Then a search for the local minima of
$\hat{P}_{\text{UP-PD-2q-MUSIC}(l)\text{-}1}(\boldsymbol{\theta})$ or $\hat{P}_{\text{UP-PD-2q-MUSIC}(l)\text{-}2}(\boldsymbol{\theta})$ (including
interpolation at each local minimum), where $\hat{\lambda}_{q,l,\min}(\boldsymbol{\theta})$ is the minimum
eigenvalue of $\hat{Q}_{q,l,1}(\boldsymbol{\theta})$ in the metric $Q_{q,l,2}(\boldsymbol{\theta})$.

Step 4. If needed, computation of both the associated estimated vectors
$\hat{\tilde{\boldsymbol{\beta}}}_{q,l,\min}(\hat{\boldsymbol{\theta}}_i)$ and the polarization vector of the sources.


8.7.3 Identifiability 

As for arrays with identical sensors, we assume an absence of coupling between 
the sensors and we limit the identifiability analysis of PD-2q-MUSIC algorithms 
to the case of statistically independent sources. 

Higher-Order Virtual Array 

As in Section 8.4.1, we deduce that, for the 2qth-order array-processing
methods exploiting Equation (8.10) with vectors $a_{q,l}(\boldsymbol{\theta}_i, \boldsymbol{\beta}_i)$ instead of
$a_{q,l}(\boldsymbol{\theta}_i)$, the $(N^q \times 1)$ vector $a_{q,l}(\boldsymbol{\theta}_i, \boldsymbol{\beta}_i)$ can be considered as the true
steering vector of source $i$ for the VA of $N^q$ VSs, with coordinates
$(x^l_{k_1 k_2 \ldots k_q},\ y^l_{k_1 k_2 \ldots k_q},\ z^l_{k_1 k_2 \ldots k_q})$ defined by Equation (8.24) and complex
amplitude patterns $f^l_{k_1 k_2 \ldots k_q}(\boldsymbol{\theta}, \boldsymbol{\beta})$, $1 \le k_j \le N$ for $1 \le j \le q$, given by

$$f^l_{k_1 k_2 \ldots k_q}(\boldsymbol{\theta}, \boldsymbol{\beta}) = \prod_{j=1}^{l} f_{k_j}(\boldsymbol{\theta}, \boldsymbol{\beta}) \prod_{u=1}^{q-l} f_{k_{l+u}}(\boldsymbol{\theta}, \boldsymbol{\beta})^{*} \tag{8.74}$$

This introduces in a very simple way, and for arrays with diversely polarized
sensors, the VA concept for the 2qth-order DOA estimation problem for the
arrangement indexed by $l$.

KP-PD-2q-MUSIC

The KP-PD-2q-MUSIC algorithm for the arrangement indexed by $l$ can estimate
the DOA of $P$ statistically independent sources from an array of $N$ sensors
provided that hypotheses H1" and H2" are verified, with $a_{q,l}(\boldsymbol{\theta}_i, \boldsymbol{\beta}_i)$ instead of
$a_{q,l}(\boldsymbol{\theta}_i)$, and that the DOA and the polarization of the sources are the only
solutions to Equation (8.56). Denoting by $N^l_{2q}$ $(N^l_{2q} \le N^q)$ the number of
different VSs of the associated VA, and following a reasoning similar to that for
identical sensors, we deduce that the KP-PD-2q-MUSIC algorithm for the
arrangement indexed by $l$ is able to process up to

$$P_{\max} = P^l_{2q,K} = N^l_{2q} - 1 \tag{8.75}$$

sources, provided the associated 2qth-order VA has no ambiguities up to order
$N^l_{2q} - 1$. The $r$th-order ambiguities of HO VA are an important problem that was
partially discussed in Chevalier et al. (2007), where it was found, in particular,
that, for a VA without colocalized sensors, $P_{\max}$ is effectively given by
Equation (8.75), whereas for a VA with colocalized sensors, $P_{\max}$ is generally
lower than Equation (8.75).

For given values of $N$, $q$, and $l$ and for general arrays of $N$ diversely
polarized sensors, it was shown in Chevalier et al. (2005) that $N^l_{2q}$ is
necessarily upper-bounded by a quantity, denoted $N_{\max}[2q, l]$, such that
$N_{\max}[2q, l] \le N^q$. Table 8.2 shows, for a general array with diversely polarized
sensors, the expression of $N_{\max}[2q, l]$ as a function of $N$ for $2 \le q \le 4$ and
several values of $l$. This upper bound corresponds to $N^l_{2q}$ in most cases of
sensor responses and array geometry.

TABLE 8.2 $N_{\max}[2q, l]$ as a Function of $N$ for Several Values
of $q$ and $l$ and for Arrays with Diversely Polarized Sensors

m = 2q       l    N_max[2q, l]
4 (q = 2)    2    N(N + 1)/2
             1    N^2
6 (q = 3)    3    N!/[6(N - 3)!] + N(N - 1) + N
             2    N!/[2(N - 3)!] + 2N(N - 1) + N
8 (q = 4)    4    N!/[24(N - 4)!] + N!/[2(N - 3)!] + 1.5N(N - 1) + N
             3    N!/[6(N - 4)!] + 1.5N!/(N - 3)! + 3N(N - 1) + N
             2    N!/[4(N - 4)!] + 2N!/(N - 3)! + 3.5N(N - 1) + N

Source: From Chevalier et al., 2005 © IEEE, with permission.


UP-PD-2q-MUSIC


The developments of the previous section are still valid for UP-PD-2q-MUSIC
algorithms. In particular, the maximal number of statistically independent
sources that may be processed by these algorithms for the arrangement indexed
by $l$ cannot exceed $N^l_{2q} - 1$ if the associated VA has no ambiguities up to order
$N^l_{2q} - 1$. Moreover, we deduce from the HO VA theory (Chevalier et al., 2005)
that $(N^q - N^l_{2q})$ components of $a_{q,l}(\boldsymbol{\theta}, \boldsymbol{\beta})$ produce no information. This means
that $a_{q,l}(\boldsymbol{\theta}, \boldsymbol{\beta})$ can be written as

$$a_{q,l}(\boldsymbol{\theta}, \boldsymbol{\beta}) = G\, a_{q,l,nr}(\boldsymbol{\theta}, \boldsymbol{\beta}) \tag{8.76}$$

where $G$ is a full-rank and constant $(N^q \times N^l_{2q})$ matrix, and $a_{q,l,nr}(\boldsymbol{\theta}, \boldsymbol{\beta})$ is
the nonredundant $(N^l_{2q} \times 1)$ steering vector of a source coming from DOA $\boldsymbol{\theta}$
with polarization $\boldsymbol{\beta}$ for the VA associated with parameters $(q, l)$.
Equation (8.76) shows that an arbitrary steering vector $a_{q,l}(\boldsymbol{\theta}, \boldsymbol{\beta})$ necessarily
belongs to the space spanned by the $N^l_{2q}$ columns of $G$, denoted Span{$G$}.
Denoting by $G_{\perp}$ a $(N^q \times (N^q - N^l_{2q}))$ full-rank matrix with columns that span
the space orthogonal to Span{$G$}, we deduce that all the vectors, $u = G_{\perp} v$, of
Im{$G_{\perp}$}, where $v$ is an arbitrary $((N^q - N^l_{2q}) \times 1)$ vector, are orthogonal to
$a_{q,l}(\boldsymbol{\theta}, \boldsymbol{\beta})$ for arbitrary values of $(\boldsymbol{\theta}, \boldsymbol{\beta})$. A direct consequence of this
result is that whatever the number, $P$, of statistically independent sources such
that $P \le N^l_{2q} - 1$, and whatever their DOA and polarization,
Im{$G_{\perp}$} $\subset$ Span{$U_{2q,n}(l)$}, which means that $(N^q - N^l_{2q})$ columns of
$U_{2q,n}(l)$ are not discriminant. In other words, we deduce from Equation (8.76)
and the previous results that only $(N^l_{2q} - P)$ columns of $U_{2q,n}(l)$ are
discriminant, while Equation (8.56) takes the form

$$a_{q,l,nr}(\boldsymbol{\theta}, \boldsymbol{\beta})^{\dagger}\, G^{\dagger}\, \Pi_{2q,n}(l)\, G\, a_{q,l,nr}(\boldsymbol{\theta}, \boldsymbol{\beta}) = a_{q,l}(\boldsymbol{\theta}, \boldsymbol{\beta})^{\dagger}\, \Pi_{2q,n,G}(l)\, a_{q,l}(\boldsymbol{\theta}, \boldsymbol{\beta}) = 0 \tag{8.77}$$

where $\Pi_{2q,n,G}(l)$ is the $(N^q \times (N^l_{2q} - P))$ orthogonal projector on
Span{$U_{2q,n}(l)$} $\cap$ Span{$G$}.

Replacing Equation (8.56) with Equation (8.77) in the developments of
Section 8.7.2, we deduce that, for given values of $(q, l)$, $\lambda_{q,l,\min}(\boldsymbol{\theta})$ and
$\tilde{\boldsymbol{\beta}}_{q,l,\min}(\boldsymbol{\theta})$ are also solutions to Equation (8.70) where $Q_{q,l,1}(\boldsymbol{\theta})$ has been
replaced with $Q_{q,l,1,G}(\boldsymbol{\theta})$, defined by

$$Q_{q,l,1,G}(\boldsymbol{\theta}) \triangleq [B_l \otimes B_{q-l}]^{\dagger}\, A_{12,q,l}(\boldsymbol{\theta})^{\dagger}\, \Pi_{2q,n,G}(l)\, A_{12,q,l}(\boldsymbol{\theta})\, [B_l \otimes B_{q-l}] \tag{8.78}$$

Because the quantity $\lambda_{q,l,\min}(\boldsymbol{\theta})$, defined by Equation (8.70) with
$Q_{q,l,1,G}(\boldsymbol{\theta})$ instead of $Q_{q,l,1}(\boldsymbol{\theta})$, has to be nulled only for the DOA of the
sources and not for other DOAs, the $((l+1)(q-l+1) \times (l+1)(q-l+1))$ matrix
$Q_{q,l,1,G}(\boldsymbol{\theta})$ has to be full rank when $\boldsymbol{\theta}$ does not correspond to a source's
DOA. Using Equation (8.78), this means that the rank of $\Pi_{2q,n,G}(l)$ cannot be
lower than $(l+1)(q-l+1)$. Thus, the number of columns of $\Pi_{2q,n,G}(l)$ must be
greater than or equal to $(l+1)(q-l+1)$. Moreover, in the presence of $P$
statistically independent sources such that $P \le N^l_{2q} - 1$, the number of columns
of $\Pi_{2q,n,G}(l)$ is equal to $N^l_{2q} - P$ for the associated VA with no ambiguities up
to order $N^l_{2q} - 1$. As a consequence, the maximal number of sources, $P_{\max}$, that
may be processed by UP-PD-2q-MUSIC algorithms for the arrangement indexed
by $l$ must, for such VAs, verify $P_{\max} \le N^l_{2q} - (l+1)(q-l+1)$.

Conversely, for a 2qth-order VA with no ambiguities up to order $N^l_{2q} - 1$, the
$P$ sources coming from $P$ different directions with different polarizations and
such that $P \le N^l_{2q} - (l+1)(q-l+1)$, are such that their DOAs are the only
solutions to $\lambda_{q,l,\min}(\boldsymbol{\theta}) = 0$. From the previous results, assuming a 2qth-order
VA for the arrangement indexed by $l$ with $N^l_{2q}$ different VSs and with no
ambiguities up to order $N^l_{2q} - 1$, we deduce that UP-PD-2q-MUSIC algorithms
for the arrangement $l$ are able to process up to

$$P_{\max} = P^l_{2q,U} = N^l_{2q} - (l+1)(q-l+1) \tag{8.79}$$

sources. This is strictly lower than Equation (8.75), and gives $P_{\max} = N - 2$ for
$q = 1$ and arrays with scalar sensors, a result obtained in Ferrara and Parks
(1983). Note that for VA with HO ambiguities, $P_{\max}$ is lower than
Equation (8.79) (Chevalier et al., 2007).
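Equations (8.75) and (8.79) are simple enough to tabulate. The Python sketch
below evaluates both capacities from a given number of different VSs; the
function names are illustrative, and the VS counts 8 and 15 used in the example
are the values quoted in Section 8.7.4 for $N = 3$, $l = 1$, and $q = 2$ and 3.

```python
def p_max_kp(n_vs):
    """Equation (8.75): capacity of KP-PD-2q-MUSIC for n_vs different VSs."""
    return n_vs - 1

def p_max_up(n_vs, q, l):
    """Equation (8.79): capacity of UP-PD-2q-MUSIC for n_vs different VSs."""
    return n_vs - (l + 1) * (q - l + 1)

# (q, l, number of different VSs); VS counts taken from Section 8.7.4.
for q, l, n_vs in [(2, 1, 8), (3, 1, 15)]:
    print(f"q={q}, l={l}: KP -> {p_max_kp(n_vs)}, UP -> {p_max_up(n_vs, q, l)}")

# For q = l = 1 and N scalar sensors (N different VSs), the UP capacity is N - 2.
print(p_max_up(6, 1, 1))
```

Both expressions assume a VA with no ambiguities up to order $N^l_{2q} - 1$; with
HO ambiguities the achievable numbers are lower, as noted above.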


8.7.4 Computer Simulations 

The performance of PD-2q-MUSIC algorithms is illustrated in this section through
computer simulations for both overdetermined and underdetermined mixtures 
of sources. 

Overdetermined Mixtures of Sources 

To show the interest of taking into account both the polarization of the sources 
and the HO statistics for direction finding, we consider a UCA of N = 6 crossed 
dipoles with a radius $r$ such that $r = 0.3\lambda$. One dipole is parallel to the x-axis;
the other is parallel to the z-axis. Three of these crossed dipoles are combined to
generate, in the y-axis, a right-sense circular polarization while the other three 
are combined to generate, in the y-axis, a left-sense circular polarization. The 
array is thus composed of two orthogonally polarized overlapped (noncolocated) 
circular subarrays of M = 3 sensors so that adjacent sensors always have different 
polarizations, as depicted in Figure 8.17(a).

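Under these assumptions, the sensors of the first and second subarrays
have a complex response to a unit electric field coming from DOA $\boldsymbol{\theta}$ with
polarization $\boldsymbol{\beta}$ equal to $f(\boldsymbol{\theta}, \boldsymbol{\beta}) = g_1(\boldsymbol{\theta}, \boldsymbol{\beta}) + j g_2(\boldsymbol{\theta}, \boldsymbol{\beta})$ and
$f(\boldsymbol{\theta}, \boldsymbol{\beta}) = g_1(\boldsymbol{\theta}, \boldsymbol{\beta}) - j g_2(\boldsymbol{\theta}, \boldsymbol{\beta})$, respectively. In these expressions,
$g_1(\boldsymbol{\theta}, \boldsymbol{\beta})$ and $g_2(\boldsymbol{\theta}, \boldsymbol{\beta})$, which correspond to the complex responses of the
two dipoles, are given by Chevalier and Ferreol (1999) and Compton (1988):

$$g_1(\boldsymbol{\theta}, \boldsymbol{\beta}) = g_1(\theta, \varphi, \gamma, \phi) = \sin\gamma\, \sin\varphi\, \cos\theta\; e^{j\phi} - \cos\gamma\, \sin\theta \tag{8.80}$$

$$g_2(\boldsymbol{\theta}, \boldsymbol{\beta}) = g_2(\theta, \varphi, \gamma, \phi) = -\sin\gamma\, \cos\varphi\; e^{j\phi} \tag{8.81}$$

FIGURE 8.17 Circular array of six equi-spaced sensors composed of two overlapped orthogonally
polarized subarrays of three sensors: (a) non-collocated subarrays; (b) collocated subarrays.
(Source: From Chevalier et al., 2007 © IEEE, with permission.)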

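In other words, the complex response, $f_n(\boldsymbol{\theta}, \boldsymbol{\beta})$, of sensor $n$ $(1 \le n \le N)$ to a
unit electric field coming from DOA $\boldsymbol{\theta}$ with polarization $\boldsymbol{\beta}$ is given by

$$f_n(\boldsymbol{\theta}, \boldsymbol{\beta}) = \big(\sin\varphi\, \cos\theta + j(-1)^n \cos\varphi\big) \sin\gamma\; e^{j\phi} - \cos\gamma\, \sin\theta \tag{8.82}$$

We then deduce from Equations (8.3), (8.55), and (8.82) that, in this case, the
$2N$ coefficients of matrix $A_{12}(\boldsymbol{\theta})$ are defined by

$$A_{12}(\boldsymbol{\theta})[n, 1] = -\sin\theta\; e^{j\zeta_n} \tag{8.83a}$$

$$A_{12}(\boldsymbol{\theta})[n, 2] = \big(\sin\varphi\, \cos\theta + j(-1)^n \cos\varphi\big)\, e^{j\zeta_n} \tag{8.83b}$$

where $1 \le n \le N$ and where $\zeta_n$ $(1 \le n \le N)$ is defined by

$$\zeta_n = 2\pi\,[x_n \cos(\theta)\cos(\varphi) + y_n \sin(\theta)\cos(\varphi) + z_n \sin(\varphi)]/\lambda \tag{8.84}$$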
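Equations (8.82) to (8.84) can be checked against Equation (8.55) with a few
lines of Python. The sketch below builds $A_{12}(\boldsymbol{\theta})$ for a circular array and
verifies that $A_{12}(\boldsymbol{\theta})\boldsymbol{\beta}$ reproduces $f_n(\boldsymbol{\theta}, \boldsymbol{\beta})\, e^{j\zeta_n}$. The sensor layout
in `uca_positions` (sensor 1 on the x-axis, array in the plane $z = 0$), the unit
wavelength, and the test angles are assumptions made for the example.

```python
import numpy as np

def uca_positions(n_sensors, radius):
    """Hypothetical layout: sensors equally spaced on a circle in the z = 0 plane."""
    ang = 2 * np.pi * np.arange(n_sensors) / n_sensors
    return radius * np.cos(ang), radius * np.sin(ang), np.zeros(n_sensors)

def zeta(theta, phi_el, pos, wavelength=1.0):
    """Geometrical phase of Equation (8.84); phi_el is the elevation angle."""
    x, y, z = pos
    return 2 * np.pi * (x * np.cos(theta) * np.cos(phi_el)
                        + y * np.sin(theta) * np.cos(phi_el)
                        + z * np.sin(phi_el)) / wavelength

def a12_matrix(theta, phi_el, pos):
    """(N x 2) matrix of Equations (8.83a) and (8.83b), sensors numbered 1..N."""
    n_sensors = len(pos[0])
    sign = (-1.0) ** np.arange(1, n_sensors + 1)          # (-1)^n
    phase = np.exp(1j * zeta(theta, phi_el, pos))
    col1 = -np.sin(theta) * phase
    col2 = (np.sin(phi_el) * np.cos(theta) + 1j * sign * np.cos(phi_el)) * phase
    return np.column_stack([col1, col2])

def sensor_response(theta, phi_el, gamma, phi_pol, n_sensors):
    """Complex responses f_n of Equation (8.82) for n = 1..N."""
    sign = (-1.0) ** np.arange(1, n_sensors + 1)
    return ((np.sin(phi_el) * np.cos(theta) + 1j * sign * np.cos(phi_el))
            * np.sin(gamma) * np.exp(1j * phi_pol) - np.cos(gamma) * np.sin(theta))

pos = uca_positions(6, 0.3)                               # r = 0.3 wavelength
theta, phi_el = np.deg2rad(50.0), np.deg2rad(20.0)        # DOA (assumed values)
gamma, phi_pol = np.pi / 4, np.deg2rad(10.0)              # polarization (assumed values)
beta = np.array([np.cos(gamma), np.exp(1j * phi_pol) * np.sin(gamma)])

a = a12_matrix(theta, phi_el, pos) @ beta                 # Equation (8.55)
expected = sensor_response(theta, phi_el, gamma, phi_pol, 6) * np.exp(1j * zeta(theta, phi_el, pos))
print(np.allclose(a, expected))
```

Setting $\theta = 0$ in `a12_matrix` makes the first column vanish, which is the
degenerate situation behind the array ambiguity discussed next.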

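Note that this corresponds to choosing vectors $a_1(\boldsymbol{\theta}) = a(\boldsymbol{\theta}, \boldsymbol{\beta}_1)$ and
$a_2(\boldsymbol{\theta}) = a(\boldsymbol{\theta}, \boldsymbol{\beta}_2)$ such that $\boldsymbol{\beta}_1 = [1, 0]^T$ and $\boldsymbol{\beta}_2 = [0, 1]^T$. Note also that
the chosen array of sensors presents ambiguities for $\boldsymbol{\theta}_0 = (0, k\pi)$, where $k$ is
an integer, and thus prevents estimating the DOA of sources coming from $\boldsymbol{\theta}_0$.
Indeed, $a_1(\boldsymbol{\theta}_0) = 0$, $Q_{q,l,1}(\boldsymbol{\theta}_0)$ and $Q_{q,l,2}(\boldsymbol{\theta}_0)$ are not full rank, and $\boldsymbol{\theta}_0$ is
always a solution to Equations (8.57) and (8.70), with $\lambda_{q,l,\min}(\boldsymbol{\theta}_0) = 0$.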

In this context, two QPSK sources with the same symbol duration, the same
raised cosine pulse-shaped filter with a roll-off of 0.3, and the same input
SNR (which would be received by an omnidirectional sensor) equal to 5 dB
are assumed to be received by the array. They are first assumed to be
weakly separated in both space and polarization and such that
$(\theta_1, \gamma_1, \phi_1) = (50^\circ, 45^\circ, 0^\circ)$ and $(\theta_2, \gamma_2, \phi_2) = (60^\circ, 45^\circ, 10^\circ)$, respectively.
Under these assumptions, Figures 8.18 and 8.19 show, as a function of the number
of snapshots $L$, the variations of the RMSE for source 2, RMSE$_2$, and the
associated probability of nonaberrant results, for several methods without and
with modeling errors, respectively. For the arrangement of the considered
statistics indexed by $l = 1$, the analyzed methods correspond to 2-MUSIC,
KP-PD-2-MUSIC, UP-PD-2-MUSIC-1, UP-PD-2-MUSIC-2, 4-MUSIC, KP-PD-4-MUSIC,
UP-PD-4-MUSIC-1, UP-PD-4-MUSIC-2, 6-MUSIC, KP-PD-6-MUSIC,
UP-PD-6-MUSIC-1, and UP-PD-6-MUSIC-2. Performance is computed from 300
realizations, and similar results are obtained for source 1.

FIGURE 8.18 (a) RMSE and (b) probability of nonaberrant results of source 2 as a function of $L$.
$P = 2$, $N = 6$, UCA, SNR = 5 dB, $(\theta_1, \gamma_1, \phi_1) = (50^\circ, 45^\circ, 0^\circ)$, $(\theta_2, \gamma_2, \phi_2) = (60^\circ, 45^\circ, 10^\circ)$,
no modeling errors. (Source: From Chevalier et al., 2007 © IEEE, with permission.)

FIGURE 8.19 (a) RMSE and (b) probability of nonaberrant results of source 2 as a function of $L$.
$P = 2$, $N = 6$, UCA, SNR = 5 dB, $(\theta_1, \gamma_1, \phi_1) = (50^\circ, 45^\circ, 0^\circ)$, $(\theta_2, \gamma_2, \phi_2) = (60^\circ, 45^\circ, 10^\circ)$,
with modeling errors, $\sigma = 0.017$. (Source: From Chevalier et al., 2007 © IEEE, with permission.)

In the presence of modeling errors, the steering vector of source $i$ at the output
of the sensors becomes

$$\tilde{a}(\boldsymbol{\theta}_i, \boldsymbol{\beta}_i) = \mu(\boldsymbol{\theta}_i, \boldsymbol{\beta}_i)\,[\bar{a}(\boldsymbol{\theta}_i, \boldsymbol{\beta}_i) + e_i]$$

where $\bar{a}(\boldsymbol{\theta}_i, \boldsymbol{\beta}_i)$ is the normalized steering vector $a(\boldsymbol{\theta}_i, \boldsymbol{\beta}_i)$ such that
$a(\boldsymbol{\theta}_i, \boldsymbol{\beta}_i) = \mu(\boldsymbol{\theta}_i, \boldsymbol{\beta}_i)\, \bar{a}(\boldsymbol{\theta}_i, \boldsymbol{\beta}_i)$, with $\bar{a}(\boldsymbol{\theta}_i, \boldsymbol{\beta}_i)^{\dagger}\, \bar{a}(\boldsymbol{\theta}_i, \boldsymbol{\beta}_i) = N$,
and the vectors $e_i$ $(1 \le i \le 2)$ are assumed to be zero-mean statistically
independent circular Gaussian vectors such that $E[e_i e_j^{\dagger}] = \sigma^2 \delta_{ij} I_N$. For
the simulations, $\sigma^2 = 0.0003$ (weak modeling errors, that is, $\sigma \approx 0.017$ as
quoted in the caption of Figure 8.19), which corresponds, for example, to a
phase error with a standard deviation of 0.6° jointly having an amplitude error
with a standard deviation of 0.1 dB. For 2-MUSIC, 4-MUSIC, and 6-MUSIC, the
six sensors of the UCA are assumed to be identical with complex responses
$f_n(\boldsymbol{\theta}, \boldsymbol{\beta}) = f(\boldsymbol{\theta}, \boldsymbol{\beta}) = g_1(\boldsymbol{\theta}, \boldsymbol{\beta}) + j g_2(\boldsymbol{\theta}, \boldsymbol{\beta})$ $(1 \le n \le N)$.

Figures 8.18 and 8.19 show, for sources that are weakly separated both in
DOA and polarization, with and without modeling errors, and for a given value
of $q$ (= 1, 2, 3), the better behavior of PD-2q-MUSIC versus 2q-MUSIC as soon
as the polarization of sources is different. This shows the better resolution and
greater robustness to modeling errors, whatever the value of $q$, of methods that
discriminate in both space and polarization over methods that discriminate the
sources only in space. Moreover, we note, for a given value of $q$, the similar
performance of UP-PD-2q-MUSIC-1 and UP-PD-2q-MUSIC-2, which seem to
differ only in complexity.

We note as well the increasing performance with $q$ of UP-PD-2q-MUSIC for
situations where resolution is required. This is due to an increasing resolution
in both DOA and polarization of the associated 2qth-order VA and shows the
interest of exploiting both polarization diversity and HO statistics for direction
finding with poorly separated sources.

For a given value of $q$, the better performance of KP-PD-2q-MUSIC over
UP-PD-2q-MUSIC, due to the exploitation of true a priori knowledge of the source
polarization, is clear. Also clear, for two sources with known polarizations, is
the improving performance of KP-PD-2q-MUSIC as $q$ increases in the presence
of modeling errors as soon as the number of snapshots grows beyond 1300.
This result seems to be directly related to the degree of coupling of the two
estimated pseudo-spectra (one for each polarization), computed by a given
KP-PD-2q-MUSIC method, which increases with modeling errors and when the
polarization separation of the sources decreases.

More precisely, when this coupling is high (weak polarization separation
with modeling errors), the two sources interact in each of the two computed
pseudo-spectra and resolution is required to separate them; hence the improved
performance with $q$ of KP-PD-2q-MUSIC. However, when this coupling is
weak (strong polarization separation or absence of modeling errors), sources
no longer interact in a given pseudo-spectrum. Then only one source has to
be found for a given pseudo-spectrum and no resolution is required, leading
to lower performance due to a higher variance in the statistics estimation
as $q$ increases.

FIGURE 8.20 (a) RMSE and (b) probability of nonaberrant results of source 2 as a function of $L$.
$P = 2$, $N = 6$, UCA, SNR = 5 dB, $(\theta_1, \gamma_1, \phi_1) = (50^\circ, 45^\circ, 0^\circ)$, $(\theta_2, \gamma_2, \phi_2) = (60^\circ, 45^\circ, 180^\circ)$,
no modeling errors. (Source: From Chevalier et al., 2007 © IEEE, with permission.)

FIGURE 8.21 (a) RMSE and (b) probability of nonaberrant results of source 2 as a function of $L$.
$P = 2$, $N = 6$, UCA, SNR = 5 dB, $(\theta_1, \gamma_1, \phi_1) = (50^\circ, 45^\circ, 0^\circ)$, $(\theta_2, \gamma_2, \phi_2) = (60^\circ, 45^\circ, 180^\circ)$,
with modeling errors, $\sigma = 0.017$. (Source: From Chevalier et al., 2007 © IEEE, with permission.)

We now consider the scenario of Figures 8.18 and 8.19, but we assume that
the two sources are still poorly angularly separated but well separated in
polarization, such that $(\theta_1, \gamma_1, \phi_1) = (50^\circ, 45^\circ, 0^\circ)$ and
$(\theta_2, \gamma_2, \phi_2) = (60^\circ, 45^\circ, 180^\circ)$, respectively. Under these assumptions,
Figures 8.20 and 8.21 show variations similar to those in Figures 8.18 and 8.19.
Whatever the value of $q$, we still see better performance with PD-2q-MUSIC
methods than with 2q-MUSIC methods because of polarization discrimination in
addition to space discrimination. We also note the very close performance of
UP-PD-2q-MUSIC-1 and UP-PD-2q-MUSIC-2, as well as the decreasing performance
of UP-PD-2q-MUSIC-1, UP-PD-2q-MUSIC-2, and KP-PD-2q-MUSIC as $q$ increases.
This is due to a higher variance in the statistics estimates, given that no
resolution is required for direction finding because of a high separation of
sources in polarization. Finally, we note that, for $q > 2$, KP-PD-2q-MUSIC may
perform surprisingly less well than UP-PD-2q-MUSIC.


Underdetermined Mixtures of Sources 

To illustrate the capability of PD-4-MUSIC and PD-6-MUSIC to process
underdetermined mixtures of sources, we limit the number of sensors in the
previous circular array to N = 3. These sensors are such that
$f_1(\theta, \beta) = f_3(\theta, \beta) = g_1(\theta, \beta) + j g_2(\theta, \beta)$ and
$f_2(\theta, \beta) = g_1(\theta, \beta) - j g_2(\theta, \beta)$ for PD-2q-MUSIC and
$f_1(\theta, \beta) = f_2(\theta, \beta) = f_3(\theta, \beta) = g_1(\theta, \beta) + j g_2(\theta, \beta)$
for 2q-MUSIC. We also assume that $l = 1$. Under these assumptions, we deduce
from Tables 1 and 2 in Chevalier et al. (2005) and Equations (8.75) and (8.79) that
$(N_4^1, P_K^1, P_U^1) = (8, 7, 4)$ for q = 2 and $(N_6^1, P_K^1, P_U^1) = (15, 14, 9)$
for q = 3 for PD-2q-MUSIC methods; we deduce from Tables 6 and 7 in
Chevalier et al. (2005) that $(N_4^1, P_{\max}) = (7, 6)$
and $(N_6^1, P_{\max}) = (12, 11)$ for 2q-MUSIC.

We then assume that four statistically independent QPSK sources with a
raised cosine pulse-shaping filter are received by the array. These sources have
the same symbol duration, the same roll-off $\mu = 0.3$, the same input SNR
of 15 dB, and DOA and polarization parameters equal to
$(\theta_1, \gamma_1, \phi_1) = (15^\circ, 45^\circ, -75^\circ)$,
$(\theta_2, \gamma_2, \phi_2) = (45^\circ, 45^\circ, 0^\circ)$,
$(\theta_3, \gamma_3, \phi_3) = (95^\circ, 22.5^\circ, 75^\circ)$,
and $(\theta_4, \gamma_4, \phi_4) = (122.5^\circ, 45^\circ, 150^\circ)$, respectively.

Under these assumptions, Figure 8.22 shows the variations, as a function
of the number of snapshots L, of the highest RMSE and the lowest probability
of nonaberrant results, among all the sources, at the output of several
methods without modeling errors. These methods are 4-MUSIC,
KP-PD-4-MUSIC, UP-PD-4-MUSIC-1, UP-PD-4-MUSIC-2, 6-MUSIC,
KP-PD-6-MUSIC, UP-PD-6-MUSIC-1, and UP-PD-6-MUSIC-2,
with performance computed from 300 realizations. Note the capability of
PD-2q-MUSIC to process underdetermined mixtures of sources provided that
$P \le P_{\max}$, given by Equation (8.75) or (8.79). Note also the poor performance
of 2q-MUSIC for the considered scenario due to the low input power of the
weakest source at the sensor outputs. Better performance would be obtained for
higher values of L.

FIGURE 8.22 (a) Maximal RMSE and (b) minimal probability of nonaberrant results of sources as a
function of L. P = 4, N = 3, UCA, SNR = 15 dB,
$(\theta_1, \gamma_1, \phi_1) = (15^\circ, 45^\circ, -75^\circ)$, $(\theta_2, \gamma_2, \phi_2) = (45^\circ, 45^\circ, 0^\circ)$,
$(\theta_3, \gamma_3, \phi_3) = (95^\circ, 22.5^\circ, 75^\circ)$, $(\theta_4, \gamma_4, \phi_4) = (122.5^\circ, 45^\circ, 150^\circ)$,
no modeling errors. (Source: From Chevalier et al., 2007, © IEEE, with permission.)


8.8 CONCLUSION 

In this chapter, the philosophy, properties, implementations (both parallel and
sequential), and performance of HO HR DOA estimation methods were presented
through a description of the 2q-MUSIC and PD-2q-MUSIC families of algorithms
(q > 1). These are well suited for any kind of array with identical and diversely
polarized sensors, respectively, for several arrangements of the 2qth-order data
statistics.

It was shown that HO cumulant-based HR DOA estimation methods have
many advantages over SO methods. They are asymptotically robust with respect
to a Gaussian background noise with spatial coherence that is unknown, and,
because of a virtual increase in both the effective aperture and the number of
array sensors, they allow for increasing resolution, increasing robustness to
modeling errors in multiple-source contexts, and a greater number of sources to be
processed as q increases.

For a given value of q, the maximal number of sources that may be processed
by an HO cumulant-based method is directly related to both the considered
array of sensors and the way the 2qth-order data statistics are arranged in the
exploited 2qth-order statistical matrix, illustrating the existence of an optimal
arrangement. In particular, the results may allow minimization of the number
of sensors and reception chains for a given number of sources to be processed
and may allow relaxation of some constraints on antenna calibration or receiver
chain equalization.

The main drawbacks of HO cumulant-based HR DOA methods versus SO methods are
their numerical complexity and their higher sensitivity to finite-sample effects,
especially for weak sources and when no resolution is required to separate the
sources. Numerical complexity may be decreased by implementing contracted
methods, which exploit only a part of the HO information and are intermediate
between SO and HO methods. However, this comes at the expense of the number
of sources that can be processed. Sensitivity to finite-sample effects may be decreased by
exploiting both SO and HO data statistics through implementation of a
direction-finding process from a blind estimation of source steering vectors, but this lessens
both robustness with respect to an unknown colored Gaussian noise and the
number of sources that can be processed.
… the Decotes research project (ANR06-BLAN-0074).










REFERENCES 

Albera, L., Ferreol, A., Cosandier-Rimele, D., Merlet, I., Wendling, F., 2008. Brain source localization 
using a fourth-order deflation scheme. IEEE Trans. Biomed. Eng. 55 (2), 490-501. 

Amblard, P.O., Gaeta, M., Lacoume, J.L., 1996. Statistics for complex variables and signals, parts I
and II. Signal Process. 53 (1), 1-25.

Bienvenu, G., Kopp, L., 1983. Optimality of high resolution array processing using the eigensystem
approach. IEEE Trans. Acoust. Speech Signal Process. 31 (5), 1235-1247.

Birot, G., Albera, L., Ferreol, A., Chevalier, P., 2007. DOA estimation based on an even order deflation
scheme. In: Proc. DSP’07.

Cardoso, J.F., 1990. Localisation et Identification par la quadricovariance. Trait. Sign. 7 (5), 397-406. 

Cardoso, J.F., 1994. How much more DOA information in higher order statistics. In: Proc. 7th 
Workshop on Statistical Signal and Array Processing, pp. 199-202. 

Cardoso, J.F., Moulines, E., 1995. Asymptotic performance analysis of direction finding algorithms 
based on fourth-order cumulants. IEEE Trans. Signal Process. 43 (1), 214-224. 

Cardoso, J.F., Souloumiac, A., 1993. Blind beamforming for non-Gaussian signals. IEE Proc.-F 140 
(6), 362-370. 

Challa, R.N., Shamsunder, S., 1998. Passive near-field localization of multiple non-Gaussian sources 
in 3D using cumulants. Signal Process. 65, 39-53. 

Chen, Y.H., Lin, Y.S., 1994a. Cumulant-based method for bearing estimation in the presence of 
non-Gaussian noise. IEEE Trans. Antennas Propag. 42 (4), 548-552. 

Chen, Y.H., Lin, Y.S., 1994b. A modified cumulant matrix for DOA estimation. IEEE Trans. Signal 
Process. 42 (11), 3287-3291. 

Chevalier, P., Ferreol, A., 1999. On the virtual array concept for the fourth-order direction finding
problem. IEEE Trans. Signal Process. 47 (9), 2592-2595.

Chevalier, P., Albera, L., Ferreol, A., Comon, P., 2005. On the virtual array concept for higher order
array processing. IEEE Trans. Signal Process. 53 (4), 1254-1271.

Chevalier, P., Benoit, G., Ferreol, A., 1996. Direction finding after blind identification of sources
steering vectors: the blind-MAXCOR and blind-MUSIC methods. In: Proc. EUSIPCO, pp.
2097-2100.

Chevalier, P., Ferreol, A., Albera, L., 2006. High resolution direction finding from higher order
statistics: the 2q-MUSIC algorithm. IEEE Trans. Signal Process. 54 (8), 2986-2997.

Chevalier, P., Ferreol, A., Albera, L., Birot, G., 2007. Higher order direction finding from arrays with
diversely polarized antennas: the PD-2q-MUSIC algorithms. IEEE Trans. Signal Process. 55
(11), 5337-5350.

Chiang, H.H., Nikias, C.L., 1989. The ESPRIT algorithm with high order statistics. In: Proc. 
Workshop on Higher Order Statistics, pp. 163-168. 

Comon, P., 1994. Independent component analysis, a new concept? Signal Process. 36 (3), 287-314
(special issue on higher-order statistics).

Compton Jr., R.T., 1988. Adaptive Antennas: Concepts and Performance. Prentice-Hall.

Dandawate, A.V., Giannakis, G.B., 1995. Asymptotic theory of mixed time averages and kth-order
cyclic-moment and cumulant statistics. IEEE Trans. Inf. Theory 41 (1), 216-232.

Demeure, C., Chevalier, P., 1998. The smart antennas at Thomson-CSF communications: concepts,
implementations, performances, applications. Ann. Telecommun. 53 (11-12), 466-482.

Dogan, M.C., Mendel, J.M., 1995a. Applications of cumulants to array processing, part I: aperture
extension and array calibration. IEEE Trans. Signal Process. 43 (5), 1200-1216.

Dogan, M.C., Mendel, J.M., 1995b. Applications of cumulants to array processing, part II:
non-Gaussian noise suppression. IEEE Trans. Signal Process. 43 (7), 1663-1676.






Fan, X., Younan, N.H., 1995. Asymptotic analysis of the cumulant-based MUSIC method in the 
presence of sample cumulant errors. IEEE Trans. Signal Process. 43 (3), 799-802. 

Ferrara Jr., E.R., Parks, T.M., 1983. Direction finding with an array of antennas having diverse
polarizations. IEEE Trans. Antennas Propag. 31 (2), 231-236.

Ferreol, A., Chevalier, P., 2000. On the behavior of current second and higher order blind source
separation methods for cyclostationary sources. IEEE Trans. Signal Process. 48 (6), 1712-1725.

Ferreol, A., Chevalier, P., Albera, L., 2002. Higher order blind separation of non zero-mean
cyclostationary sources. In: Proc. EUSIPCO 02, pp. 103-106.

Ferreol, A., Boyer, E., Larzabal, P., 2004a. Low-cost algorithm for some bearing estimation methods
in presence of separable nuisance parameters. Electron. Lett. 40 (15), 966-967.

Ferreol, A., Chevalier, P., Albera, L., 2004b. Second order blind separation of first and second order
cyclostationary sources: application to AM, FSK, CPFSK and deterministic sources. IEEE
Trans. Signal Process. 52 (4), 845-861.

Ferreol, A., Larzabal, P., Viberg, M., 2006. On the asymptotic performance analysis of subspace DOA
estimation in the presence of modeling errors: case of MUSIC. IEEE Trans. Signal Process. 54
(3), 907-920.

Forster, P., Nikias, C., 1991. Bearing estimation in the bispectrum domain. IEEE Trans. Signal
Process. 39 (9), 1994-2006.

Friedlander, B., 1990. A sensitivity analysis of the MUSIC algorithm. IEEE Trans. Acoust. Speech
Signal Process. 38 (10), 1740-1751.

Germain, P., Maguer, A., Kopp, L., 1989. Comparison of resolving power of array processing methods
by using an analytical criterion. In: Proc. ICASSP.

Gonen, E., Mendel, J.M., 1999. Applications of cumulants to array processing, part VI: polarization
and direction of arrival estimation with minimally constrained arrays. IEEE Trans. Signal
Process. 47 (9), 2589-2592.

Gonen, E., Mendel, J.M., Dogan, M.C., 1997. Applications of cumulants to array processing, part
IV: direction finding in coherent signals case. IEEE Trans. Signal Process. 45 (9), 2265-2276.

Jacovitti, G., Scarano, G., 1994. Hybrid nonlinear moments in array processing and spectrum analysis. 
IEEE Trans. Signal Process. 42 (7), 1708-1718. 

Kaveh, M., Barabell, A.J., 1986. The statistical performance of the MUSIC and the minimum norm
algorithms in resolving plane waves in noise. IEEE Trans. Acoust. Speech Signal Process. 34
(4), 331-341.

Li, F., Vaccaro, R.J., 1992. Sensitivity analysis of DOA estimation algorithms to sensor errors. IEEE 
Trans. Aerosp. Electron. Syst. 28 (3), 708-717. 

Liu, T.H., Mendel, J.M., 1999a. Cumulant-based subspace tracking. Signal Process. 76 (3), 237-252. 

Liu, T.H., Mendel, J.M., 1999b. Applications of cumulants to array processing, part V: sensitivity
issues. IEEE Trans. Signal Process. 47 (3), 746-759.

Liu, J., Huang, Z., Zhou, Y., 2008. Extended 2q-MUSIC algorithm for noncircular signals. Signal
Process. 88, 1327-1339.

McCullagh, P., 1987. Tensor Methods in Statistics. Monographs on Statistics and Applied Probability.
Chapman and Hall.

Mendel, J.M., 1991. Tutorial on higher-order statistics (spectra) in signal processing and system 
theory: theoretical results and some applications. Proc. IEEE 79 (3), 278-305. 

Mosher, J.C., Leahy, R.M., 1999. Source localization using recursively applied and projected (RAP) 
MUSIC. IEEE Trans. Signal Process. 47 (2), 332-340. 

Oh, S.K., Un, C.K., 1993. A sequential estimation approach for performance improvement of 
eigenstructure-based methods in array processing. IEEE Trans. Signal Process. 41, 457-463. 






Paulraj, A., Kailath, T., 1986. Eigenstructure methods for direction of arrival estimation in the
presence of unknown noise field. IEEE Trans. Acoust. Speech Signal Process. 34 (1), 13-20.

Paulraj, A., Roy, R., Kailath, T., 1986. A subspace rotation approach to signal parameter estimation. 
Proc. IEEE 74 (7), 1044-1045. 

Picinbono, B., 1994. On circularity. IEEE Trans. Signal Process. 42 (12), 3473-3482. 

Porat, B., Friedlander, B., 1988. Analysis of the asymptotic relative efficiency of the MUSIC 
algorithm. IEEE Trans. Acoust. Speech Signal Process. 36 (4), 532-544. 

Porat, B., Friedlander, B., 1991. Direction finding algorithms based on higher order statistics. IEEE 
Trans. Signal Process. 39 (9), 2016-2024. 

Priestley, M.B., 1981. Spectral Analysis and Time Series. Academic Press.

Proakis, J.G., 1995. Digital Communications, Third ed. McGraw-Hill. 

Roy, R., Kailath, T., 1989. ESPRIT-Estimation of signal parameters via rotational invariance 
techniques. IEEE Trans. Acoust. Speech Signal Process. 37 (7), 984-995. 

Scarano, G., Jacovitti, G., 1996. Applications of generalized cumulants to array processing. Signal 
Process. 53, 179-193. 

Schmidt, R.O., 1986. Multiple emitter location and signal parameter estimation. IEEE Trans. 
Antennas Propag. 34 (3), 276-280. 

Shamsunder, S., Giannakis, G.B., 1993. Modeling of non-Gaussian array data using cumulants: DOA 
estimation of more sources with less sensors. Signal Process. 30 (3), 279-297. 

Shamsunder, S., Giannakis, G.B., 1994. Signal selective localization of non-Gaussian cyclostationary 
sources. IEEE Trans. Signal Process. 42 (10), 2860-2864. 

Stoica, P., Nehorai, A., 1989. MUSIC, maximum likelihood and Cramer-Rao bound. IEEE Trans.
Acoust. Speech Signal Process. 37 (5), 720-741.

Stoica, P., Nehorai, A., 1990. MUSIC, maximum likelihood and Cramer-Rao bound: further results
and comparisons. IEEE Trans. Acoust. Speech Signal Process. 38 (12), 2140-2150.

Stoica, P., Handel, P., Nehorai, A., 1995. Improved sequential MUSIC. IEEE Trans. Aerosp. Electron.
Syst. 31 (4), 1230-1239.

Swami, A., Mendel, J., 1991. Cumulant-based approach to the harmonic retrieval and related 
problems. IEEE Trans. Signal Process. 39 (5), 1099-1109. 

Swindlehurst, A.L., Kailath, T., 1992. A performance analysis of subspace-based methods in the
presence of model errors, part I: the MUSIC algorithm. IEEE Trans. Signal Process. 40 (3),
1758-1773.

Yuen, N., Friedlander, B., 1996. Asymptotic performance analysis of ESPRIT, higher order ESPRIT 
and virtual ESPRIT algorithms. IEEE Trans. Signal Process. 44 (10), 2537-2550. 

Yuen, N., Friedlander, B., 1997. DOA estimation in multipath: an approach using fourth-order 
cumulants. IEEE Trans. Signal Process. 45 (5), 1253-1263. 

Yuen, N., Friedlander, B., 1998. Performance analysis of higher order ESPRIT for localization of 
near-field sources. IEEE Trans. Signal Process. 46 (3), 709-719. 





Chapter 9 



Source and Node Localization 
in Sensor Networks 


Chiao-En Chen, Kung Yao


9.1 INTRODUCTION 

The information-processing technologies of the last 50 years that made the
Internet and the Web possible, including modern microelectronics, have resulted
in low-cost personal computers (PCs), servers, and a worldwide telecommunication
and computer-networking infrastructure. In the last 10 years, sensor networking
has combined the technology of modern microelectronic sensors, embedded
computational processing systems, and modern computer and wireless networking
methodologies. It is believed that sensor networking in the 21st century will
be equally significant by providing measurement of spatial-temporal physical
phenomena, leading to a better understanding and utilization of this information
in a wide range of applications. Sensor networking will be able to bring
finer-grained and fuller measurements (using acoustic, seismic, magnetic, infrared (IR),
imaging, and video data) to characterize the world, to be processed and
communicated so that decision makers can use the information to take actions in
near real time.

Localization is the determination of the location of an object, whether that
of one or more source(s) emitting some energy of interest (e.g., radio frequency
(RF), acoustic, ultrasonic, optical, IR, seismic, thermal) or that of a sensor node in the
sensor network (SN). It is a major issue of interest for many applications. Often,
the localization of a source is of primary interest, while the localization of the
sensor nodes is needed to support other functions, including the localization of
one or more source(s).
For example, the Federal Communications Commission (FCC) has mandated
the E911 system (Europe has the E112 system), requiring cellular telephone
providers to determine the location of a cell phone user (in an emergency mode)
to within tens of meters. The localization of a vehicle in various military scenarios,
the location of a robotic device in automated manufacturing, and the locations of
birds and animals in bio-complexity studies are all of great interest.

Before discussing source and node localization in an SN, it is important
to review some history. The SN as a concept and in realization appeared only
around 2000 as the result of the accumulation of enabling technologies in the last
50 years. The concept of a programmable digital computer originated in the
1940s. In the 1950s, mainframe electronic digital computers were built; they
were expensive and were only available in a few educational, governmental,
and commercial research organizations. At this time, basic concepts of digital
communication also became known. In the 1960s, mini-computers became popular
and digital computations were made available to more users. In that period,
satellite and terrestrial microwave communication made the transmission of large
amounts of digital data possible. The concept of data transmission over a network
of many nodes distributed over large areas was pioneered by researchers of the
Arpanet. In the 1970s, microprocessors significantly reduced the cost of digital
computations, and the availability of low-cost DSP chips made digital processing
possible for many applications. Commercial and military communication
and computer networks spread around the world.

In the 1980s, PCs appeared, and the beginning of the Internet allowed
researchers at a few research and large commercial organizations to communicate
easily. In the 1990s, optical communication networks and the availability
of the Web browser allowed the explosive growth of worldwide communications
among individuals through the Internet. In this period, advances in
embedded processors and wireless communication technology led to the creation
of ad hoc networks and the explosive worldwide usage of cellular telephony.
Around 2000, with all of these available technologies, sensor networking was
made possible. In recent years, the interest in SN at the research and development
… conferences/workshop proceedings (ACM International Workshop on Wireless
Sensor Networks; IEEE and ACM International Symposium on Information Processing
in Sensor Networks (IPSN); International Conference on Embedded
Networked Sensor Systems (SenSys)) and new SN journals (ACM Transactions
on Sensor Networks; International Journal of Ad Hoc and Sensor Wireless
Networks; International Journal of Distributed Sensor Networks; International
Journal of Sensor Networks).

A sensor network consists of dozens/hundreds/thousands of nodes (possibly
randomly distributed), each with a sensor (acoustic, seismic, magnetic, chemical,
image, video, temperature, etc.), a low-power embedded processor (of varying
processing capability), a radio (e.g., a low-power transceiver of varying
capability and range), a battery, often of limited energy and size, and a program
controlling one or more nodes and possibly some parts of the network to
perform some given task. Constraints from all factors (distribution of the sensors
in unfortunate locations, limited processing capability, bad propagation conditions,
nonuniform energy use among the nodes resulting in the early death of
some, nonrobust networking protocols and network management software, etc.)
can limit the proper operation of an SN and its ability to perform source and
node localization.

Another important factor in source localization is the physical nature of the
source. In radar and wireless communications, the information signal is modulated
on some high RF frequency, $f_0$, for efficient transmission. In general, the
bandwidth of the signal over $[0, f_s]$ is much less than the RF frequency. Thus,
the ratio of the highest to lowest transmitted frequency, $(f_0 + f_s)/(f_0 - f_s)$, is
typically near unity. For the 802.11b ISM wireless local area network (LAN)
system, for example, the ratio is 2.4835 GHz/2.4 GHz ≈ 1.03. These waveforms
are denoted as narrowband. Narrowband waveforms have a well-defined nominal
wavelength, and time delays can be compensated by simple phase shifts.
The conventional narrowband beamformer operating on these waveforms is
merely a spatial extension of the matched filter. In classical time-domain filtering,
the time-domain signal is linearly combined with the filtering weights to achieve
the desired high/low/bandpass filtering. The narrowband beamformer likewise combines
the data collected by the spatially distributed sensor array linearly with the
beamforming weights to achieve spatial filtering. Beamforming enhances the signal from
the desired spatial direction and reduces the signal(s) from other direction(s) in
addition to possible time-frequency filtering.
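
The claim that narrowband time delays can be compensated by simple phase shifts can be checked numerically. The following is a minimal sketch, not taken from the chapter: the function name `delay_as_phase_shift_error`, the cosine envelope, and the numerical values (carrier, bandwidths, 1 ns delay) are all illustrative assumptions.

```python
import numpy as np


def delay_as_phase_shift_error(f0, bandwidth, tau, num=4001):
    """Relative error made when a true delay tau is replaced by a carrier phase shift.

    Hypothetical helper: the envelope is a single slow tone at `bandwidth` Hz.
    """
    t = np.linspace(0.0, 4.0 / bandwidth, num)  # four envelope periods

    def signal(u):
        envelope = np.cos(2.0 * np.pi * bandwidth * u)  # slowly varying message
        carrier = np.exp(2j * np.pi * f0 * u)           # complex carrier at f0
        return envelope * carrier

    delayed = signal(t - tau)                             # exact time delay
    shifted = signal(t) * np.exp(-2j * np.pi * f0 * tau)  # phase shift only
    return np.linalg.norm(delayed - shifted) / np.linalg.norm(delayed)


# Assumed values: 2.44 GHz carrier, 1 ns delay, 1 MHz versus 100 MHz envelope.
print(delay_as_phase_shift_error(f0=2.44e9, bandwidth=1e6, tau=1e-9))
print(delay_as_phase_shift_error(f0=2.44e9, bandwidth=1e8, tau=1e-9))
```

Under these assumptions the error is below one percent for the 1 MHz envelope and becomes large for the 100 MHz envelope, which is the sense in which the phase-shift approximation is tied to the narrowband condition.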

The movement of persons, cars, trucks, wheeled/tracked vehicles, and vibrating
machinery can all generate acoustic or seismic waveforms. The processing
of seismic/vibrational sensor data is similar to that of acoustic sensors except
for the propagation medium and unknown speed. For acoustic/seismic waveforms,
the ratio of the highest to lowest frequencies can be several octaves. For
audio waveforms (i.e., 30 Hz to 15 kHz), the ratio is about 500, and these waveforms
are denoted as wideband. Dominant acoustical waveforms generated from
wheeled/tracked vehicles may range from 20 Hz to 2 kHz, resulting in a ratio of
about 100. Similarly, dominant seismic waveforms generated from wheeled vehicles
may range from 5 Hz to 500 Hz, also resulting in a ratio of about 100. Thus,
the acoustic and seismic signals of interest are generally wideband. However,
even for certain RF applications, the ratio of the highest to lowest frequencies
can also be considerably greater than unity. For wideband waveforms there is no
characteristic wavelength and time delays must be obtained by interpolation.
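
The frequency ratios quoted for the narrowband and wideband cases are simple to reproduce. The sketch below uses a hypothetical helper, `band_ratio`, and takes the 2.4 GHz to 2.4835 GHz band edges for the ISM example; the octave count is added only for illustration.

```python
import math


def band_ratio(f_low_hz, f_high_hz):
    """Return (highest/lowest frequency ratio, span in octaves)."""
    ratio = f_high_hz / f_low_hz
    return ratio, math.log2(ratio)


# Band edges as quoted in the text; the labels are illustrative.
bands = {
    "802.11b ISM (2.4-2.4835 GHz)": (2.4e9, 2.4835e9),
    "audio (30 Hz-15 kHz)": (30.0, 15e3),
    "vehicle acoustic (20 Hz-2 kHz)": (20.0, 2e3),
    "vehicle seismic (5-500 Hz)": (5.0, 500.0),
}
for name, (f_low, f_high) in bands.items():
    ratio, octaves = band_ratio(f_low, f_high)
    print(f"{name}: ratio = {ratio:.4g}, about {octaves:.1f} octaves")
```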

When an acoustic or seismic source is located close to the sensors, the
wavefront of the received signal is curved and the curvature depends on distance; the
source is thus said to be in the near field. As the distances become large, the
wavefront is planar and parallel; the source is then said to be in the far field. For
a far-field source, only the direction-of-arrival (DOA) angle in the coordinate
system of the sensors is observable to characterize the source. Consider an ideal
(noise-free) linear beamforming array with uniform inter-sensor spacing. For a
far-field source, any two adjacent sensors receive identical waveforms that differ only
by the same time delay, from which the DOA of the source can be estimated. For a near-field
source, the collection of all relative time delays and the propagation speed
can be used to determine the source location.
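
As a minimal sketch of the far-field case, assume the standard plane-wave relation $\tau = d \sin\theta / c$ between the inter-sensor delay $\tau$, the spacing $d$, and the DOA $\theta$ measured from the array broadside. This relation, the function name `far_field_doa_deg`, and the default speed of 343 m/s (sound in air) are assumptions made for illustration and are not stated in the text.

```python
import math


def far_field_doa_deg(tau_s, spacing_m, speed_mps=343.0):
    """DOA in degrees from broadside, from the delay between two adjacent sensors.

    Assumes a plane wave, i.e. tau = spacing * sin(theta) / speed.
    """
    s = speed_mps * tau_s / spacing_m
    if abs(s) > 1.0:
        raise ValueError("delay is larger than the spacing allows for a plane wave")
    return math.degrees(math.asin(s))


# Example: 0.5 m spacing and a source 30 degrees off broadside.
tau = 0.5 * math.sin(math.radians(30.0)) / 343.0
print(far_field_doa_deg(tau, 0.5))
```

The range check reflects the fact that a delay larger than $d/c$ cannot be produced by a plane wave, so such a measurement signals noise or a near-field source.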

In general, wideband beamforming is considerably more complex than
narrowband beamforming. Thus, the acoustic source localization and beamforming
problem is challenging because of its wideband nature, near- and far-field
geometry (relatively near/far distance of the source from the sensor array), and arbitrary
array shape.

The presentation in this chapter is as follows. The source localization problem
is treated in Section 9.2, in which localization using trilateration and
multilateration is first introduced in Section 9.2.1. The time-difference-of-arrival (TDOA)
source localization method using the maximum-power blind beamforming and
least squares is presented subsequently in Section 9.2.2. In Section 9.2.3 we
briefly introduce the maximum likelihood formulation of the source localization
problem and its theoretical performance bound. In Section 9.2.4 we introduce
the energy-based source localization methodology.

In Section 9.3 we introduce a few approaches that address the node
localization problem. Noise-free node localization using SDP relaxation is first
introduced in Section 9.3.1. Sections 9.3.2 and 9.3.3 present two formulations
for cases where the distance measurement is contaminated by noise. Classical
multidimensional scaling is introduced in Section 9.3.4. In Section 9.3.5 we
describe a distributed algorithm based on the nonlinear Gauss-Seidel algorithm
and present a few numerical simulation results.

In Section 9.4 we demonstrate a few experimental results of source
localization and compare the performance of several algorithms using the implemented
acoustic testbed.


9.2 SOURCE LOCALIZATION METHODS APPLIED 
TO SENSOR NETWORKS 

9.2.1 Trilateration/Multilateration 


Trilateration is conceptually the simplest method for performing source
localization of a single source based on the range estimates from several sensors. Here
we assume each sensor node can estimate the range from the source to itself.
There are various ways to make this range estimate in practical scenarios. To find
the location of the source, we need to reference it with respect to some coordinate
system. Furthermore, we must assume that the locations of the sensor nodes are
known with respect to the same coordinate system. The problem of determining
the locations of the sensor nodes is referred to as the node localization problem
and will be discussed in Section 9.3.

In trilateration, we assume all range-sensing nodes and the source are situated
on the x-y plane: range-sensing nodes A, B, and C are located at $(x_A, y_A)$, $(x_B, y_B)$,
and $(x_C, y_C)$, respectively, as shown in Figure 9.1. If node A estimates the source
to be at range $d_A$ and node B estimates the source at range $d_B$, then the intersection
of these two circles yields two possible ambiguous source locations marked by $S$
and $S'$. Similarly, if node C estimates the source at range $d_C$, then the intersection
of the circles of node B and node C yields two possible ambiguous source locations
$S$ and $S''$, while the intersection of the circles of node C and node A yields two possible
ambiguous source locations $S$ and $S'''$. Clearly, the true source location $(x, y)$ is
at the intersection of all three circles at $S$, which satisfies the following equations:

$$(x - x_A)^2 + (y - y_A)^2 = d_A^2 \tag{9.1}$$

$$(x - x_B)^2 + (y - y_B)^2 = d_B^2 \tag{9.2}$$

$$(x - x_C)^2 + (y - y_C)^2 = d_C^2 \tag{9.3}$$

FIGURE 9.1 Trilateration based on three circles.

Of course, in the presence of noisy estimates of $d_A$, $d_B$, $d_C$, the three circles
will not intersect at a single point to yield a unique location $S$. As a result, the
source location estimate is obtained through a least-squares (LS) fit given by

$$\begin{bmatrix} x \\ y \end{bmatrix} =
\begin{bmatrix} 2(x_A - x_C) & 2(y_A - y_C) \\ 2(x_B - x_C) & 2(y_B - y_C) \end{bmatrix}^{-1}
\begin{bmatrix} x_A^2 - x_C^2 + y_A^2 - y_C^2 + d_C^2 - d_A^2 \\ x_B^2 - x_C^2 + y_B^2 - y_C^2 + d_C^2 - d_B^2 \end{bmatrix} \tag{9.4}$$
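
The linear system behind this fit follows from subtracting Equation (9.3) from Equations (9.1) and (9.2), which cancels the quadratic terms in $x$ and $y$. A minimal sketch of that computation is given below; the function name `trilaterate` and the example coordinates are illustrative assumptions.

```python
import numpy as np


def trilaterate(node_a, node_b, node_c, d_a, d_b, d_c):
    """Source position from three node positions and three range estimates.

    Subtracting the circle equation of node C from those of nodes A and B
    leaves two linear equations in (x, y).
    """
    (xa, ya), (xb, yb), (xc, yc) = node_a, node_b, node_c
    A = 2.0 * np.array([[xa - xc, ya - yc],
                        [xb - xc, yb - yc]])
    b = np.array([xa**2 - xc**2 + ya**2 - yc**2 + d_c**2 - d_a**2,
                  xb**2 - xc**2 + yb**2 - yc**2 + d_c**2 - d_b**2])
    return np.linalg.solve(A, b)  # fails if the three nodes are collinear


# Illustrative geometry: nodes at three corners, true source at (3, 4).
source = np.array([3.0, 4.0])
nodes = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
ranges = [float(np.linalg.norm(source - np.array(p))) for p in nodes]
print(trilaterate(*nodes, *ranges))
```

With exact ranges the source is recovered; with noisy ranges the same expression returns the point consistent with the two linearized equations.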


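In multilateration, N range-sensing nodes with known locations $(x_1, y_1)$,
$(x_2, y_2), \ldots, (x_N, y_N)$ are used. Let the distance measurements obtained at
the nodes be $d_1, d_2, \ldots, d_N$. Then the least-squares solution can be obtained
similarly by

$$\mathbf{r}_s = (\mathbf{A}^T \mathbf{A})^{-1} \mathbf{A}^T \mathbf{b}, \tag{9.5}$$

where $\mathbf{r}_s = [x, y]^T$ and, taking node N as the reference in the same way as
node C above,

$$\mathbf{A} = \begin{bmatrix} 2(x_1 - x_N) & 2(y_1 - y_N) \\ \vdots & \vdots \\ 2(x_{N-1} - x_N) & 2(y_{N-1} - y_N) \end{bmatrix}, \quad
\mathbf{b} = \begin{bmatrix} x_1^2 - x_N^2 + y_1^2 - y_N^2 + d_N^2 - d_1^2 \\ \vdots \\ x_{N-1}^2 - x_N^2 + y_{N-1}^2 - y_N^2 + d_N^2 - d_{N-1}^2 \end{bmatrix} \tag{9.6}$$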
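
A minimal sketch of the multilateration least-squares solution follows. It assumes that the last node is used as the reference when the circle equations are differenced, by analogy with the three-node case; the function name `multilaterate`, the node layout, and the noise level are illustrative. `numpy.linalg.lstsq` is used in place of forming $(\mathbf{A}^T\mathbf{A})^{-1}\mathbf{A}^T\mathbf{b}$ explicitly; the two agree when $\mathbf{A}$ has full column rank.

```python
import numpy as np


def multilaterate(nodes, ranges):
    """Least-squares source position from N >= 3 node positions and ranges.

    Assumption: the last node serves as the reference whose circle equation
    is subtracted from all the others.
    """
    nodes = np.asarray(nodes, dtype=float)    # shape (N, 2)
    ranges = np.asarray(ranges, dtype=float)  # shape (N,)
    ref, d_ref = nodes[-1], ranges[-1]
    A = 2.0 * (nodes[:-1] - ref)
    b = (np.sum(nodes[:-1] ** 2, axis=1) - np.sum(ref ** 2)
         + d_ref ** 2 - ranges[:-1] ** 2)
    r_s, *_ = np.linalg.lstsq(A, b, rcond=None)
    return r_s


# Illustrative setup: five nodes, true source at (3, 4), slightly noisy ranges.
rng = np.random.default_rng(0)
nodes = np.array([[0.0, 0.0], [10.0, 0.0], [10.0, 10.0], [0.0, 10.0], [5.0, -5.0]])
source = np.array([3.0, 4.0])
true_ranges = np.linalg.norm(nodes - source, axis=1)
noisy_ranges = true_ranges + 0.01 * rng.standard_normal(len(nodes))
print(multilaterate(nodes, noisy_ranges))
```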
9.2.2 TDOA Source Localization Methods

The problem of determining the location of a source given a set of differential time
delays between sensors has been studied for many years. Localization techniques
include a two-stage procedure: TDOA estimation and TDOA-based source
localization. Closed-form solutions can be derived for the second stage using spherical
interpolation (SI) (Schau and Robinson, 1987; Smith and Abel, 1987), hyperbolic
intersection (Chan and Ho, 1994), or linear intersection (Brandstein et al.,
1997). However, these techniques all require knowledge of the speed of propagation.
Chen et al. (2002c) and Yao et al. (1998) derived both closed-form
LS and constrained least-squares (CLS) solutions for the case in which the speed
of propagation is unknown. It can be shown that
when the speed of propagation is known, the LS method of Chen et al. (2002c)
and Yao et al. (1998) coincides with the method independently derived later by
Huang et al. (200…). Their method is mathematically equivalent to the SI
technique with less computational complexity. In theory, these two-stage methods
can be used for multiple-source localization as long as the relative time delays
can be estimated from the data. However, this remains challenging in practice.
As a result, we will only consider single-source TDOA and source localization
schemes in this subsection.

In the following paragraphs, we present the maximum-power beamforming
method (Yao et al., 1998) for TDOA estimation and the LS and CLS methods
for source localization.


Maximum-Power Blind Beamforming

Let there be M sources impinging on an array composed of P sensor elements.
Denote the wavefront generated by the mth source by $s_m(t)$, $m = 1, \ldots, M$; then
the waveform received by the pth sensor can be expressed as

$$x_p(t) = \sum_{m=1}^{M} s_m\bigl(t - t_p^{(m)}\bigr) + \eta_p(t), \quad p = 1, \ldots, P \tag{9.7}$$

Here $t_p^{(m)}$ denotes the propagation time from the mth source to the pth sensor,
and $\eta_p(t)$ is additive noise with zero mean and variance $\sigma^2$. Figure 9.2 shows
a simple example where M = 2 and P = 3. It is assumed that each sensor has a
memory of L taps.
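
The received-signal model of Equation (9.7) can be simulated directly once the signals are sampled. The sketch below is illustrative only: it assumes integer propagation delays expressed in samples, white Gaussian noise, and zero signal before a wavefront arrives; the function name `simulate_array` and the delay values are hypothetical.

```python
import numpy as np


def simulate_array(sources, delays, noise_std, rng):
    """Sampled sensor waveforms: sum of delayed sources plus additive noise.

    sources: array of shape (M, T), one row per source waveform.
    delays:  array of shape (P, M), nonnegative integer delays in samples
             from source m to sensor p (assumption: integer-sample delays).
    """
    sources = np.asarray(sources, dtype=float)
    delays = np.asarray(delays, dtype=int)
    num_sensors, num_sources = delays.shape
    num_samples = sources.shape[1]
    x = noise_std * rng.standard_normal((num_sensors, num_samples))
    for p in range(num_sensors):
        for m in range(num_sources):
            d = delays[p, m]
            # s_m delayed by d samples; nothing is received before arrival.
            x[p, d:] += sources[m, :num_samples - d]
    return x


# Example with M = 2 sources and P = 3 sensors, as in the simple example above.
rng = np.random.default_rng(0)
s = rng.standard_normal((2, 200))
x = simulate_array(s, delays=[[5, 1], [3, 2], [0, 4]], noise_std=0.1, rng=rng)
print(x.shape)
```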

We first consider the case where only the wavefront from the first source
(solid circles) is present. The impact of the second wavefront (dashed circles) is
considered later. Let the received waveform be sampled at the rate of $1/T$ samples
per second. Without loss of generality, we can let $T = 1$. We also take sensor
1 as our reference point and assume that it is the furthest from the first source in
Figure 9.2, followed by sensor 2, with sensor 3 the closest. It follows that the
sampled data vectors at the three sensors can be denoted as

$$\mathbf{x}^{(1)} = [x_1(n - n_1), x_1(n - n_1 - 1), \ldots, x_1(n - n_1 - L + 1)]^T \tag{9.8}$$

$$\mathbf{x}^{(2)} = [x_2(n - n_2), x_2(n - n_2 - 1), \ldots, x_2(n - n_2 - L + 1)]^T \tag{9.9}$$

$$\mathbf{x}^{(3)} = [x_3(n - n_3), x_3(n - n_3 - 1), \ldots, x_3(n - n_3 - L + 1)]^T \tag{9.10}$$

where $n_1 = 0$ and $0 \le n_3 \le n_2 \le L - 1$ from our earlier assumptions. Define


x = 


x (i)r^ x (2)r^ x (3)r 


Then the space-time correlation matrix can be expressed as 


(9.11) 
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(9.12) 


In general, we want to find an algorithm that generates the beamformer output 
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(9.13) 
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whi ch satisfies some desired criterion. The maximum-power beamformer ( Yao 
et al.. 1 19981) chooses the weights such that the power of its output is maximized 
under the constraint || W 3 L || 2 = 1 , where 


T 

W3L = [wio, Wn, .Wi(L-l), W20, . W2(L-1), • ••, W 30 , • ..,W3(L-1)] (9.14) 


It is reasonable to expect that the combination corresponding to the largest output
power is the one that sums coherently the strongest of the signals, to the
disadvantage of the weaker ones. Putting the optimality criterion into matrix
form, we have the following maximization problem:

$$\text{maximize } \mathbf{w}_{3L}^T \mathbf{R}_{3L} \mathbf{w}_{3L}, \quad \text{subject to } \|\mathbf{w}_{3L}\| = 1 \tag{9.15}$$

The unity constraint on the norm of the weights $\mathbf{w}_{3L}$ ensures that the
array output noise power is the same as the input noise power, and therefore the
maximization of Equation (9.15) is equivalent to the maximization of the
signal-to-noise ratio (SNR) at the output of the array. The solution to
Equation (9.15) is then given by the eigenvector corresponding to the largest
eigenvalue of $\mathbf{R}_{3L}$ in the following eigenvalue problem:

$$\mathbf{R}_{3L}\mathbf{w}_{3L} = \lambda^{(3L)}\mathbf{w}_{3L} \tag{9.16}$$

where $0 \le \lambda_1^{(3L)} \le \cdots \le \lambda_{3L}^{(3L)}$. In practice,
$\mathbf{R}_{3L}$ is implemented by the time-averaged sample covariance matrix of
$\mathbf{x}$. The maximum-power criterion discussed here can be justified from the
point of view of Szegő's asymptotic eigenvalue distribution theory (Grenander and
Szegő, 1958). A corresponding singular value decomposition (SVD) formulation of
this problem has been used but is omitted here. An efficient QLP approximation of
the SVD, proposed by Stewart (1999), can also be used to find the weights of the
maximum-power (MP) beamformer.
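
The eigenvector solution of Equation (9.15) can be sketched numerically as follows. This is a minimal illustration rather than the authors' implementation: the helper names `space_time_snapshots` and `max_power_weights`, the toy signal, and the circular delays produced by `np.roll` are all assumptions made for the example, and $P$ is not restricted to three sensors.

```python
import numpy as np

def space_time_snapshots(x, L):
    """Stack L taps per sensor into space-time data vectors.

    x : (P, N) array of sampled sensor waveforms.
    Returns a (P*L, N-L+1) matrix whose columns are
    [x_1(n), ..., x_1(n-L+1), ..., x_P(n), ..., x_P(n-L+1)]^T.
    """
    P, N = x.shape
    cols = [np.concatenate([x[p, n - L + 1:n + 1][::-1] for p in range(P)])
            for n in range(L - 1, N)]
    return np.array(cols).T

def max_power_weights(x, L):
    """Unit-norm weights maximizing the output power w^T R w."""
    X = space_time_snapshots(x, L)
    R = X @ X.T / X.shape[1]              # time-averaged sample correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)  # eigenvalues in ascending order
    return eigvecs[:, -1], eigvals[-1], R

# Toy data (hypothetical): one smoothed-noise source seen by 3 sensors
# with different sample delays, plus weak sensor noise.
rng = np.random.default_rng(0)
s = np.convolve(rng.standard_normal(2000), np.ones(4) / 4, mode="same")
x = np.stack([np.roll(s, d) for d in (0, 3, 5)])
x = x + 0.1 * rng.standard_normal(x.shape)
w, lam, R = max_power_weights(x, L=8)
```

Here `lam` is the maximized output power and `w` holds the $PL$ tap weights in the ordering of Equation (9.14). How relative delays are read off the weights is not covered by this sketch.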

The MP blind beamformer can be used to estimate the relative time delays of the
dominant source at the sensors, with or without interference. Consider a
simulation using a field-measured tracked-vehicle acoustic source with a
spectral peak near 100 Hz plus an interferer modeled as a second-order
autoregressive (AR) source with coefficients $a_1 = -2 \times 0.989\cos(2\pi \times 0.12)$
and $a_2 = 0.989^2$, resulting in a spectral peak near 120 Hz. The x-y coordinates
of the three sensors, the tracked vehicle, and the interferer are set to
{(12, 0), (0, 12), (-9, 0)}, (7, -12), and (6.08, -8.438), respectively. The true
time delays of the vehicle are at 12, 7, and 5 samples, while those of the
interferer are at 11, 7, and 4 samples, among all the sensors. Figure 9.3(a) shows
the time-delay estimates based on the dominant eigenvector method using the MP
blind beamformer; Figure 9.3(b) shows these estimates based on the classical
correlation method (Carter, 1987) operating directly on the sensor data. For a
signal-to-interference ratio (SIR) higher than approximately 4 dB, the
eigenvector method finds the delays of the strongest source (the vehicle) with
essentially no error. For an SIR of less than approximately 3 dB, the strongest
source is the interferer and its delays are also found with essentially no
error. The estimated delay uncertainty region is only about 1 dB. Figure 9.3(b)
shows that the classical correlation method yields much less accurate estimation
results and that the estimated delay uncertainty region is about 20 dB.

FIGURE 9.3 Comparison of eigenvector (a) and classical (b) correlation methods on time-delay
estimation. (Source: From Yao et al., 1998. © 1998 IEEE, with permission.)

Least-Squares and Constrained Least-Squares 
Source Localization 

The problem of determining the location of a source based on a set of TDOA
measurements is equivalent to the problem of determining the position of a
receiver given equally biased range measurements to a set of satellites
(Chafee and Abel, 1994). There are several ways to solve the problem
(Bancroft, 1985), but no closed-form solution appears to be available when the
propagation speed is unknown. This section introduces an algebraic solution to
the problem of estimating both propagation speed and position.

Denote the source location in Cartesian coordinates by $\mathbf{r}_s = [x_s, y_s, z_s]^T$
and the $p$th sensor location by $\mathbf{r}_p = [x_p, y_p, z_p]^T$. Without loss of
generality, we choose the first sensor as the reference sensor for differential
time delays and let the reference sensor be the origin of the coordinate system
for simplicity. The speed of propagation $v$ in this formulation can also be
estimated from the data. In some problems, $v$ may be considered partially known
(e.g., acoustic applications), while in others it may be considered to be unknown
(e.g., seismic applications).

Let the relative time delay between the $p$th sensor and the first sensor be
$t_{p1}$; then

$$t_{p1} = t_p - t_1 = \frac{\|\mathbf{r}_s - \mathbf{r}_p\| - \|\mathbf{r}_s - \mathbf{r}_1\|}{v} \tag{9.17}$$

where $p = 2, \ldots, P$. This is a set of $(P - 1)$ nonlinear equations that makes
finding the solution for $\mathbf{r}_s$ nontrivial.

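To simplify the estimation to a linear problem, we make use of the following
relationship:

$$\|\mathbf{r}_s - \mathbf{r}_p\|^2 - \|\mathbf{r}_s\|^2 = \|\mathbf{r}_p\|^2 - 2(x_s x_p + y_s y_p + z_s z_p) \tag{9.18}$$

The left side of Equation (9.18) is equivalent to

$$\bigl(\|\mathbf{r}_s - \mathbf{r}_p\| - \|\mathbf{r}_s\|\bigr)\bigl(\|\mathbf{r}_s - \mathbf{r}_p\| + \|\mathbf{r}_s\|\bigr) = v t_{p1}\bigl(2\|\mathbf{r}_s\| + v t_{p1}\bigr) \tag{9.19}$$

in the case of $\mathbf{r}_1 = \mathbf{0}$. By combining both expressions, we have the
following linear relation:

$$\mathbf{r}_p^T\mathbf{r}_s + v t_{p1}\|\mathbf{r}_s\| + \frac{v^2 t_{p1}^2}{2} = \frac{\|\mathbf{r}_p\|^2}{2}, \quad p = 2, \ldots, P \tag{9.20}$$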


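With $P$ sensors, we formulate the least-squares solution by putting the $(P - 1)$
linear equations of Equation (9.20) into the matrix form

$$\mathbf{A}_1\mathbf{y}_1 = \mathbf{b} \tag{9.21}$$

where

$$\mathbf{A}_1 = \begin{bmatrix} \mathbf{r}_2^T & t_{21} & t_{21}^2/2 \\ \mathbf{r}_3^T & t_{31} & t_{31}^2/2 \\ \vdots & \vdots & \vdots \\ \mathbf{r}_P^T & t_{P1} & t_{P1}^2/2 \end{bmatrix}, \quad \mathbf{y}_1 = \begin{bmatrix} \mathbf{r}_s \\ v\|\mathbf{r}_s\| \\ v^2 \end{bmatrix} \tag{9.22}$$

and

$$\mathbf{b} = \frac{1}{2}\begin{bmatrix} \|\mathbf{r}_2\|^2 \\ \|\mathbf{r}_3\|^2 \\ \vdots \\ \|\mathbf{r}_P\|^2 \end{bmatrix} \tag{9.23}$$

The LS solution for the unknown vector $\mathbf{y}_1$ is simply
$\hat{\mathbf{y}}_1 = \mathbf{A}_1^{\dagger}\mathbf{b}$, where
$\mathbf{A}_1^{\dagger} = (\mathbf{A}_1^T\mathbf{A}_1)^{-1}\mathbf{A}_1^T$ is the
pseudo-inverse of $\mathbf{A}_1$. The source location estimate is then given by the
first three elements of $\hat{\mathbf{y}}_1$, and the speed-of-propagation estimate
is given by the square root of the last element of $\hat{\mathbf{y}}_1$.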
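
The LS solution of Equations (9.21) through (9.23) can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' code: the function name `ls_localize`, the random eight-sensor geometry, and the noise-free delays are hypothetical, and the column scaling is a numerical convenience added here that does not change the least-squares solution.

```python
import numpy as np

def ls_localize(sensors, tdoa):
    """Closed-form LS estimate of source position and propagation speed.

    sensors : (P, 3) sensor positions; sensors[0] is the reference sensor.
    tdoa    : (P-1,) relative delays t_{p1}, p = 2..P, in seconds.
    """
    r = sensors[1:] - sensors[0]                 # put the reference at the origin
    A1 = np.column_stack([r, tdoa, 0.5 * tdoa**2])
    b = 0.5 * np.sum(r**2, axis=1)
    scale = np.linalg.norm(A1, axis=0)           # column scaling for conditioning
    y1 = np.linalg.lstsq(A1 / scale, b, rcond=None)[0] / scale
    r_s = y1[:3] + sensors[0]                    # first three elements: position
    v = np.sqrt(y1[4]) if y1[4] > 0 else np.nan  # last element estimates v^2
    return r_s, v

# Noise-free toy scenario (hypothetical geometry, acoustic speed in m/s).
rng = np.random.default_rng(0)
sensors = rng.uniform(-10.0, 10.0, size=(8, 3))
source = np.array([3.0, -4.0, 5.0])
v_true = 345.0
dist = np.linalg.norm(source - sensors, axis=1)
tdoa = (dist[1:] - dist[0]) / v_true
r_hat, v_hat = ls_localize(sensors, tdoa)
```

With noisy delays the last element of the solution can become negative, in which case this sketch returns `nan` for the speed instead of a complex value.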

In the three-dimensional localization problem, there are five unknowns in
$\mathbf{y}_1$ and therefore at least six sensors (five relative delays) are
required to obtain a solution. However, placing sensors randomly does not provide
much assurance against ill-conditioned solutions. The preferred approach is to use
seven or more sensors, yielding six or more relative delays, and to perform a
least-squares fitting to the data. For a two-dimensional localization problem, the
minimum number of sensors can be reduced by one. If the propagation speed $v$ is
known, then the minimum number of sensors can be further reduced by one.

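Note that the speed of propagation can also be estimated by

$$v = \frac{v\|\mathbf{r}_s\|}{\|\mathbf{r}_s\|} \tag{9.24}$$

using the fourth and first three elements of $\mathbf{y}_1$. To exploit this
relationship, we can add another nonlinear constraint to ensure the equivalence
between the speed of propagation estimates from the fourth and fifth elements of
$\mathbf{y}_1$. We first rewrite Equation (9.21) as

$$\mathbf{A}_2\mathbf{y}_2 = \mathbf{b} + v^2\mathbf{d} \tag{9.25}$$

where

$$\mathbf{A}_2 = \begin{bmatrix} \mathbf{r}_2^T & t_{21} \\ \mathbf{r}_3^T & t_{31} \\ \vdots & \vdots \\ \mathbf{r}_P^T & t_{P1} \end{bmatrix}, \quad \mathbf{y}_2 = \begin{bmatrix} \mathbf{r}_s \\ v\|\mathbf{r}_s\| \end{bmatrix} \tag{9.26}$$

and

$$\mathbf{b} = \frac{1}{2}\begin{bmatrix} \|\mathbf{r}_2\|^2 \\ \|\mathbf{r}_3\|^2 \\ \vdots \\ \|\mathbf{r}_P\|^2 \end{bmatrix}, \quad \mathbf{d} = -\frac{1}{2}\begin{bmatrix} t_{21}^2 \\ t_{31}^2 \\ \vdots \\ t_{P1}^2 \end{bmatrix} \tag{9.27}$$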


The CLS solution for the unknown vector is then given by
$\hat{\mathbf{y}}_2 = \mathbf{A}_2^{\dagger}\mathbf{b} + v^2\mathbf{A}_2^{\dagger}\mathbf{d}$.
Define $\mathbf{p} = \mathbf{A}_2^{\dagger}\mathbf{b}$ and
$\mathbf{q} = \mathbf{A}_2^{\dagger}\mathbf{d}$; then the estimates for the source
location and propagation speed can be expressed as

$$x_s = p_1 + v^2 q_1 \tag{9.28}$$

$$y_s = p_2 + v^2 q_2 \tag{9.29}$$

$$z_s = p_3 + v^2 q_3 \tag{9.30}$$

$$v\|\mathbf{r}_s\| = p_4 + v^2 q_4 \tag{9.31}$$

where $p_i$ and $q_i$ are the $i$th entries of $\mathbf{p}$ and $\mathbf{q}$,
respectively. Using the relation $\|\mathbf{r}_s\|^2 = x_s^2 + y_s^2 + z_s^2$ and
Equation (9.31), we obtain the following third-order constraint equation:

$$\alpha (v^2)^3 + \beta (v^2)^2 + \gamma (v^2) + \delta = 0 \tag{9.32}$$

where

$$\alpha = q_1^2 + q_2^2 + q_3^2 \tag{9.33}$$

$$\beta = 2(p_1 q_1 + p_2 q_2 + p_3 q_3) - q_4^2 \tag{9.34}$$

$$\gamma = p_1^2 + p_2^2 + p_3^2 - 2 p_4 q_4 \tag{9.35}$$

$$\delta = -p_4^2 \tag{9.36}$$


There exist at most three solutions to the third-order equation. The estimate
of propagation speed is then given by the positive square root of the real
positive solution of Equation (9.32). If there is more than one positive real
root, only the estimate in the range of feasible propagation speeds consistent
with the known propagation medium is used. With real-life data, we have not
encountered multiple estimates within the range of feasible propagation speeds.

Once the speed of propagation is estimated, the source location can be
obtained by setting

$$x_s = p_1 + v^2 q_1 \tag{9.37}$$

$$y_s = p_2 + v^2 q_2 \tag{9.38}$$

$$z_s = p_3 + v^2 q_3 \tag{9.39}$$

Compared to the LS method, the minimum required number of sensors in the
CLS method is reduced by one.
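
The CLS procedure can be sketched as follows: form $\mathbf{p}$ and $\mathbf{q}$, solve the cubic in $v^2$, keep a root inside a feasible speed range, and substitute back. The function name `cls_localize`, the bounds `v_min` and `v_max`, and the toy geometry are assumptions made for illustration only.

```python
import numpy as np

def cls_localize(sensors, tdoa, v_min, v_max):
    """Constrained LS estimate of source position and propagation speed.

    sensors : (P, 3) sensor positions; sensors[0] is the reference sensor.
    tdoa    : (P-1,) relative delays t_{p1}, p = 2..P, in seconds.
    v_min, v_max : feasible range of propagation speed (assumed known).
    """
    r = sensors[1:] - sensors[0]
    A2 = np.column_stack([r, tdoa])
    b = 0.5 * np.sum(r**2, axis=1)
    d = -0.5 * tdoa**2
    A2_pinv = np.linalg.pinv(A2)
    p, q = A2_pinv @ b, A2_pinv @ d
    # Coefficients of the cubic in u = v^2, Equations (9.33)-(9.36).
    coeffs = [q[:3] @ q[:3],
              2.0 * (p[:3] @ q[:3]) - q[3]**2,
              p[:3] @ p[:3] - 2.0 * p[3] * q[3],
              -p[3]**2]
    roots = np.roots(coeffs)
    real = roots[np.abs(roots.imag) <= 1e-9 * np.abs(roots)].real
    feasible = [u for u in real if u > 0 and v_min <= np.sqrt(u) <= v_max]
    if not feasible:
        return None, None                 # no root in the feasible speed range
    v2 = feasible[0]
    return p[:3] + v2 * q[:3] + sensors[0], np.sqrt(v2)

# Noise-free toy scenario (hypothetical geometry, acoustic speed in m/s).
rng = np.random.default_rng(0)
sensors = rng.uniform(-10.0, 10.0, size=(8, 3))
source = np.array([3.0, -4.0, 5.0])
v_true = 345.0
dist = np.linalg.norm(source - sensors, axis=1)
tdoa = (dist[1:] - dist[0]) / v_true
r_hat, v_hat = cls_localize(sensors, tdoa, v_min=300.0, v_max=400.0)
```

If several roots fall inside the feasible range, this sketch simply takes the first one; a real application would need a rule for that case.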


Cramer-Rao Bound Analysis 

The Cramer-Rao bound (CRB) is one of the most well-known theoretical lower 
bounds on the variances among all unbiased estimators. Here we derive the CRB 
for the estimation of source location and speed of propagation with respect to 
the relative time-delay estimation error. 

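We first denote the unknown parameter vector by $\Theta = [\mathbf{r}_s^T, v]^T$, and
model the time-delay measurement as

$$\hat{\mathbf{t}} = \mathbf{t}(\Theta) + \boldsymbol{\xi} \tag{9.40}$$

where $\mathbf{t} = [t_{21}, \ldots, t_{P1}]^T$ is the true relative time-delay vector
and $\boldsymbol{\xi}$ is the estimation error vector, which is assumed to be
zero-mean white Gaussian with covariance $\mathbf{R}_{\xi} = \sigma^2\mathbf{I}$.
Define the gradient matrix
$\mathbf{H} = \partial\mathbf{t}/\partial\Theta^T = [\partial\mathbf{t}/\partial\mathbf{r}_s^T, \partial\mathbf{t}/\partial v]$;
then $\mathbf{H}$ can be expressed as

$$\mathbf{H} = \frac{1}{v}[\mathbf{B}, -\mathbf{t}] \tag{9.41}$$

where $\mathbf{B} = [\mathbf{u}_2 - \mathbf{u}_1, \ldots, \mathbf{u}_P - \mathbf{u}_1]^T$, and
$\mathbf{u}_p = (\mathbf{r}_s - \mathbf{r}_p)/\|\mathbf{r}_s - \mathbf{r}_p\|$ denotes the
direction of the source from the $p$th sensor. The Fisher information matrix
(Kay, 1993) can then be derived as

$$\mathbf{F}_{\mathbf{r}_s, v} = \mathbf{H}^T\mathbf{R}_{\xi}^{-1}\mathbf{H} \tag{9.42}$$

$$= \frac{1}{\sigma^2 v^2}\begin{bmatrix} \mathbf{B}^T\mathbf{B} & -\mathbf{B}^T\mathbf{t} \\ -\mathbf{t}^T\mathbf{B} & \mathbf{t}^T\mathbf{t} \end{bmatrix} \tag{9.43}$$

The theoretical lower bound of the variances in estimating $\mathbf{r}_s$ is given by
the diagonal elements of the leading $3 \times 3$ submatrix of the inverse Fisher
information matrix

$$\bigl[\mathbf{F}_{\mathbf{r}_s, v}^{-1}\bigr]_{11:33} = \sigma^2 v^2\bigl[\mathbf{B}^T\mathbf{P}_{\mathbf{t}}^{\perp}\mathbf{B}\bigr]^{-1} \tag{9.44}$$

where $\mathbf{P}_{\mathbf{t}}^{\perp} = \mathbf{I} - \mathbf{t}\mathbf{t}^T/\|\mathbf{t}\|^2$. The
variance bound of the distance $d$ between the estimated source location and the
actual location is given by
$\sigma_d^2 \ge \operatorname{trace}\bigl[\mathbf{F}_{\mathbf{r}_s, v}^{-1}\bigr]_{11:33}$.
For the estimation of propagation speed, the variance bound is given by

$$\sigma_v^2 \ge \bigl[\mathbf{F}_{\mathbf{r}_s, v}^{-1}\bigr]_{44} = \sigma^2 v^2\bigl[\mathbf{t}^T\mathbf{P}_{\mathbf{B}}^{\perp}\mathbf{t}\bigr]^{-1} \tag{9.45}$$

where $\mathbf{P}_{\mathbf{B}}^{\perp} = \mathbf{I} - \mathbf{B}(\mathbf{B}^T\mathbf{B})^{-1}\mathbf{B}^T$.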

Analysis shows that the CRBs of both the source location and the propagation
speed estimates grow linearly with the variance of the time-delay estimate,
$\sigma^2$, and with $v^2$. When the speed of propagation is known, the CRB for the
source location estimate becomes
$\sigma^2 v^2 \operatorname{trace}[(\mathbf{B}^T\mathbf{B})^{-1}]$, which is always
smaller than that of the unknown speed of propagation case. We can define
$\mathbf{A}_{rs} = \mathbf{B}^T\mathbf{B}$ as the array matrix, which provides a measure
of geometric relations between the source and the sensor array. Poor array
geometry may lead to a degeneration in the rank of $\mathbf{A}_{rs}$ and thus result
in a large CRB.
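
As a numerical check on these expressions, the following sketch builds $\mathbf{H}$ and the Fisher information matrix directly and inverts it, so the closed forms of Equations (9.44) and (9.45) can be compared against it. The function name `tdoa_crb`, the random seven-sensor geometry, and the 10 microsecond delay error are illustrative assumptions.

```python
import numpy as np

def tdoa_crb(sensors, source, v, sigma):
    """CRB for source position and propagation speed from TDOA data.

    sensors : (P, 3) sensor positions; sensors[0] is the reference sensor.
    sigma   : standard deviation of the relative time-delay error (seconds).
    Returns the 3x3 position bound, the speed bound, and B, t for reuse.
    """
    diff = source - sensors
    dist = np.linalg.norm(diff, axis=1)
    u = diff / dist[:, None]               # unit vectors from sensors to source
    B = u[1:] - u[0]
    t = (dist[1:] - dist[0]) / v           # true relative delays t_{p1}
    H = np.column_stack([B, -t]) / v       # gradient matrix, Equation (9.41)
    F = H.T @ H / sigma**2                 # Fisher information, Equation (9.42)
    F_inv = np.linalg.inv(F)
    return F_inv[:3, :3], F_inv[3, 3], B, t

# Toy scenario (hypothetical): 7 sensors, 10 microsecond delay error.
rng = np.random.default_rng(2)
sensors = rng.uniform(-10.0, 10.0, size=(7, 3))
source = np.array([3.0, -4.0, 5.0])
v, sigma = 345.0, 10e-6
crb_loc, crb_v, B, t = tdoa_crb(sensors, source, v, sigma)
crb_known_speed = sigma**2 * v**2 * np.trace(np.linalg.inv(B.T @ B))
```

Comparing `np.trace(crb_loc)` with `crb_known_speed` illustrates the statement that knowing the propagation speed can only tighten the location bound.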

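By transforming to the polar coordinate system in the 2-D case, the CRB
for the source range and the DOA estimate can also be obtained. Denote
$r_s = \sqrt{x_s^2 + y_s^2}$ as the distance between the source and the reference
position, such as the array centroid; then the DOA is given by
$\phi_s = \tan^{-1}(x_s/y_s)$ with respect to the y-axis. The time delay from the
source to the $p$th sensor can be expressed as
$t_p = \sqrt{r_s^2 + r_p^2 - 2 r_s r_p\cos(\phi_s - \phi_p)}/v$, where $(r_p, \phi_p)$ is
the location of the $p$th sensor in the polar coordinate system. The gradient
matrix $\mathbf{H}$ can be computed as

$$\mathbf{H} = \left[\frac{\partial\mathbf{t}}{\partial r_s}, \frac{\partial\mathbf{t}}{\partial\phi_s}, \frac{\partial\mathbf{t}}{\partial v}\right] \tag{9.46}$$

$$= \frac{1}{v}[\mathbf{B}_{\mathrm{pol}}, -\mathbf{t}] \tag{9.47}$$

where $\mathbf{B}_{\mathrm{pol}} = v\,[\partial\mathbf{t}/\partial r_s, \partial\mathbf{t}/\partial\phi_s]$ and

$$\frac{\partial t_{p1}}{\partial r_s} = \frac{1}{v}\left[\frac{r_s - r_p\cos(\phi_s - \phi_p)}{\|\mathbf{r}_s - \mathbf{r}_p\|} - \frac{r_s - r_1\cos(\phi_s - \phi_1)}{\|\mathbf{r}_s - \mathbf{r}_1\|}\right] \tag{9.48}$$

$$\frac{\partial t_{p1}}{\partial\phi_s} = \frac{1}{v}\left[\frac{r_s r_p\sin(\phi_s - \phi_p)}{\|\mathbf{r}_s - \mathbf{r}_p\|} - \frac{r_s r_1\sin(\phi_s - \phi_1)}{\|\mathbf{r}_s - \mathbf{r}_1\|}\right] \tag{9.49}$$

The leading $2 \times 2$ submatrix of the inverse polar Fisher information matrix is
then given by

$$\bigl[\mathbf{F}_{\mathrm{pol}}^{-1}\bigr]_{11:22} = \sigma^2 v^2\bigl[\mathbf{B}_{\mathrm{pol}}^T\mathbf{P}_{\mathbf{t}}^{\perp}\mathbf{B}_{\mathrm{pol}}\bigr]^{-1} \tag{9.50}$$

and the CRBs of the range and DOA estimates can then be obtained as
$[\mathbf{F}_{\mathrm{pol}}^{-1}]_{11}$ and $[\mathbf{F}_{\mathrm{pol}}^{-1}]_{22}$, respectively.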

To evaluate the performance of the LS and CLS source localization algorithms,
a simulated scenario, depicted in Figure 9.4(a), is used. A randomly
distributed array of seven sensors collects the signal generated by a source
moving on a straight line. By perturbing the actual time delays by white Gaussian
noise with zero mean and standard deviation $\sigma_{t_d} = 10\ \mu$sec, the LS and
CLS algorithms are applied to estimate the source location at each time frame. As
depicted in Figure 9.4(b), the CLS yields much better range and DOA estimates
than the LS on an average of 10,000 random realizations. The CLS estimates
are also very close to the corresponding CRB.

The speed of propagation estimation results for $\sigma_{t_d} = 10\ \mu$sec and
$50\ \mu$sec are compared in Figure 9.5(a). The CLS is again dramatically better
than the LS because of the additional nonlinear constraint equation. At a fixed
source location (10, 10), the range and DOA estimation errors are plotted for
different levels of relative time-delay measurement error. We consider the same
array of seven randomly distributed sensors, and then take away one sensor and use
only the remaining six for comparison. The range and DOA estimation errors are
linearly proportional to the relative time-delay error, as depicted in
Figure 9.5(b). The CLS is again shown to attain the CRB and to be much better than
the LS. Performance also improves when an additional sensor is added to the
overdetermined solution.


9.2.3 Maximum Likelihood Source Localization 


Maximum Likelihood Formulation 


In this subsection, we present the maximum likelihood solution (Bohme, 1986;
Chen et al., 2002a; and Ziskind and Wax, 1988) for both near-field and far-field
source localization. Consider the problem of localizing $M$ wideband sources
using a $P$-element arbitrarily distributed array ($M < P$). The sampled waveform
received at the $p$th sensor can be described by the following time-domain model:

$$x_p(n) = \sum_{m=1}^{M} a_p^{(m)} s_0^{(m)}\bigl(n - n_p^{(m)}\bigr) + \eta_p(n), \quad n = 1, \ldots, L, \quad p = 1, \ldots, P \tag{9.51}$$

Here $a_p^{(m)}$ denotes the signal level of the $m$th source at the $p$th sensor,
$s_0^{(m)}(n)$ is the signal of the $m$th source at the reference point, and
$n_p^{(m)}$ is the relative time delay (in samples) of the $m$th source at the
$p$th sensor with respect to the reference point.
FIGURE 9.4 (a) Traveling source scenario and sensor scenario. (b) Range estimate (top) and
DOA estimate (bottom) versus source locations ($\sigma_{t_d} = 10\ \mu$sec). (Source: From
Chen et al., 2002c. © 2002 Sage, with permission.)


The sensor noise $\eta_p(n)$ is modeled as a zero-mean, i.i.d. Gaussian process
with variance $\sigma^2$. A block of $L$ samples in each sensor's data can be
transformed to the frequency domain using the discrete-time Fourier transform,
which results in the following array signal model:

$$\mathbf{X}(\omega) = \mathbf{D}(\omega)\mathbf{S}_0(\omega) + \boldsymbol{\eta}(\omega), \quad -\pi < \omega < \pi \tag{9.52}$$

Here $\mathbf{D}(\omega)$ is the steering matrix at frequency $\omega$, and
$\boldsymbol{\eta}(\omega)$ is the noise spectrum vector. Under certain assumptions
(Chen et al., 2002a) and using the discrete Fourier transform (DFT)/fast Fourier
transform (FFT) implementation, the maximum likelihood estimate for the considered
source localization problem can be obtained by solving

$$\hat{\mathbf{r}}_s = \arg\max_{\mathbf{r}_s} J(\mathbf{r}_s) = \arg\max_{\mathbf{r}_s} \sum_{k} \|\mathbf{P}_{\mathbf{D}}(\omega_k)\mathbf{X}(\omega_k)\|^2 \tag{9.53}$$

where $\mathbf{r}_s$ is the source location vector, and
$\mathbf{P}_{\mathbf{D}}(\omega_k) = \mathbf{D}(\omega_k)[\mathbf{D}^H(\omega_k)\mathbf{D}(\omega_k)]^{-1}\mathbf{D}^H(\omega_k)$.

FIGURE 9.5 (a) RMSE speed of propagation versus source locations. (b) Range estimate (top)
and DOA estimate (bottom) versus source locations ($\mathbf{r}_s = [10, 10]^T$). (Source: From
Chen et al., 2002c. © 2002 Sage, with permission.)

Figure 9.6(a) shows the simulation result of the near-field normalized
$J(\mathbf{r}_s)$ metric of a vehicle source having high values about its location
inside the convex hull of an array of five sensors. Thus, for near-field scenarios
this metric is capable of estimating the source location. Figure 9.6(b) shows the
simulation result of the far-field normalized $J(\mathbf{r}_s)$ metric of a vehicle
source having high values in the angular sector in the direction of the source.
Thus, for far-field scenarios, this metric is capable of estimating the source
DOA. The details of this algorithm as well as other simulation and
field-measured estimated results can be found in Chen et al. (2002b). Note that
the set of parameters to estimate can be augmented with the propagation velocity
and noise variance, if these are unknown.

FIGURE 9.6 (a) Plot of $J(\mathbf{r}_s)$ of a near-field source capable of source localization.
(b) Plot of $J(\mathbf{r}_s)$ of a far-field source capable of DOA source estimation.

This algorithm is fairly computationally complex, mostly because of the
maximization of a function, $J(\mathbf{r}_s)$, that is usually nonconvex and displays
a large number of local maxima. Moreover, all the sensors that participate in the
estimate need to be synchronized and fully collaborative. The algorithm is
centralized; that is, the relevant information has to be sent to a central node
provided with sufficient computational power. These are issues in networks of
low-power sensors, but they may not be a problem in environment-monitoring
networks where nodes are more complex and may have higher available energy supplies.
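
For a single source, the projection in Equation (9.53) reduces to a matched steering vector, which makes the metric easy to sketch on a grid. The following is an illustrative sketch only: it assumes unit sensor gains, a known propagation speed, a noise-free near-field source, and hypothetical helper names `steering`, `ml_metric`, and `grid_search`.

```python
import numpy as np

def steering(sensors, r, omegas, v):
    """(P, K) steering vectors for a candidate location r (unit gains assumed)."""
    delays = np.linalg.norm(sensors - r, axis=1) / v
    return np.exp(-1j * np.outer(delays, omegas))

def ml_metric(X, sensors, r, omegas, v):
    """J(r) = sum_k ||P_D(w_k) X(w_k)||^2 for a single source.

    With one source, P_D X = d (d^H X) / (d^H d) and d^H d = P.
    """
    D = steering(sensors, r, omegas, v)
    proj = np.sum(D.conj() * X, axis=0)          # d^H X for each frequency bin
    return np.sum(np.abs(proj) ** 2) / sensors.shape[0]

def grid_search(X, sensors, omegas, v, grid):
    """Return the grid point with the largest metric value."""
    return max(grid, key=lambda r: ml_metric(X, sensors, np.asarray(r), omegas, v))

# Noise-free toy scenario (hypothetical): 5 sensors, source inside the array.
rng = np.random.default_rng(3)
sensors = np.array([[0, 0], [10, 0], [0, 10], [10, 10], [5, 12]], dtype=float)
r_true = np.array([4.0, 6.0])
v = 345.0
omegas = 2 * np.pi * np.arange(50.0, 151.0, 10.0)   # rad/s, 50 to 150 Hz
S = rng.standard_normal(omegas.size) + 1j * rng.standard_normal(omegas.size)
X = steering(sensors, r_true, omegas, v) * S          # (P, K) sensor spectra
grid = [(float(gx), float(gy)) for gx in range(11) for gy in range(11)]
r_hat = grid_search(X, sensors, omegas, v, grid)
```

A coarse grid like this only illustrates the shape of the search; the local maxima noted above are the reason a practical search needs a finer grid or a good initial estimate.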


Cramer-Rao Bound Analysis 

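In this subsection, we derive the Cramer-Rao bound for the DOA estimation of
wideband sources in the far-field scenario. The Cramer-Rao bound for source
location estimation can be derived under the same concept, but the analytical
expression is more complicated and therefore will not be presented here.
Consider the following far-field array signal model

$$\mathbf{X}(\omega_k) = \mathbf{D}(\omega_k)\mathbf{S}_0(\omega_k) + \boldsymbol{\eta}(\omega_k), \quad k = 1, \ldots, L \tag{9.54}$$

where $\mathbf{D}(\omega_k) = [\mathbf{d}^{(1)}(\omega_k), \ldots, \mathbf{d}^{(M)}(\omega_k)]$, and
$\boldsymbol{\eta}(\omega_k)$ is modeled as i.i.d. zero-mean Gaussian with unknown
covariance $\mathbf{R}_{\eta} = \sigma^2\mathbf{I}$. Define
$\Psi = [\Theta^T, \mathbf{S}_0^T, \sigma^2]^T$ as the vector of unknown parameters,
where $\Theta = [\theta_1, \ldots, \theta_M]^T$ denotes the DOAs of the $M$ sources, and
$\mathbf{S}_0 = [\mathbf{S}_0(\omega_1)^T, \ldots, \mathbf{S}_0(\omega_L)^T]^T$. Using
Equation (9.54), the $(i, j)$ element of the Fisher information matrix for the
estimation of $\Psi$ can be computed as

$$[\mathbf{F}]_{ij} = L\,\operatorname{tr}\left[\mathbf{R}_{\eta}^{-1}\frac{\partial\mathbf{R}_{\eta}}{\partial\psi_i}\mathbf{R}_{\eta}^{-1}\frac{\partial\mathbf{R}_{\eta}}{\partial\psi_j}\right] + 2\operatorname{Re}\left[\frac{\partial\mathbf{b}^H}{\partial\psi_i}\bigl(\mathbf{I}_L \otimes \mathbf{R}_{\eta}^{-1}\bigr)\frac{\partial\mathbf{b}}{\partial\psi_j}\right] \tag{9.55}$$

where

$$\mathbf{b} = \bigl[\mathbf{S}_0(\omega_1)^T\mathbf{D}(\omega_1)^T, \ldots, \mathbf{S}_0(\omega_L)^T\mathbf{D}(\omega_L)^T\bigr]^T \tag{9.56}$$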


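After some mathematical manipulations, the Cramer-Rao bound expression
(Chen et al., 2007)

$$\mathrm{CRB}_{\Theta\Theta} = \frac{\sigma^2}{2}\left\{\sum_{k}\operatorname{Re}\Bigl[\bigl(\mathbf{E}(\omega_k)^H\mathbf{P}_{\mathbf{D}}^{\perp}(\omega_k)\mathbf{E}(\omega_k)\bigr) \odot \mathbf{R}_S(\omega_k)^T\Bigr]\right\}^{-1} \tag{9.57}$$

can be obtained, where
$\mathbf{E}(\omega_k) = [\partial\mathbf{d}^{(1)}(\omega_k)/\partial\theta_1, \ldots, \partial\mathbf{d}^{(M)}(\omega_k)/\partial\theta_M]$,
$\mathbf{P}_{\mathbf{D}}^{\perp}(\omega_k) = \mathbf{I} - \mathbf{P}_{\mathbf{D}}(\omega_k)$, and
$\mathbf{R}_S(\omega_k) = \mathbf{S}_0(\omega_k)\mathbf{S}_0(\omega_k)^H$.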

The CRB expression in Equation (9.57) contains contributions from all
frequency bins through a direct summation. The contribution from each
frequency bin is an element-wise matrix product of the geometry factor,
$\mathbf{E}(\omega_k)^H\mathbf{P}_{\mathbf{D}}^{\perp}(\omega_k)\mathbf{E}(\omega_k)$, and the
spectral factor, $\mathbf{R}_S(\omega_k)$. The geometry factor provides a measure of
geometric relations between the sources and the sensor array, while the spectral
factor provides a measure of correlation among the $M$ sources. For frequency bins
where no signals are present, the spectral factors are just zero matrices and thus
do not contribute to $\mathrm{CRB}_{\Theta\Theta}$, as is intuitively expected. We note
that the CRB in Friedlander and Weiss (1993) is a stochastic CRB, in which the
narrowband signal at each frequency bin is characterized by its covariance
function; in the CRB of Chen et al. (2007), as shown in Equation (9.57), we treated
the narrowband signal at each frequency bin as deterministic and unknown. A
general CRB method for source localization based on the one-step approach
(jointly process all the sensor data) and the two-step approach (measure signal
parameters, then infer source location) was proposed (Moore et al., 2008).


9.2.4 Energy-Based Source Localization 


One of the key requirements of TDOA-based and many other source
localization algorithms is synchronization among sensor nodes. However, accurate
synchronization requires extra energy consumption and communication bandwidth,
which can be costly for wireless sensor networks. Recently, energy-based
approaches were proposed for acoustic source localization to relax such
constraints (Li et al., 2002). This is possible because the acoustic power emitted
by targets such as vehicles usually varies slowly with time and therefore a coarser
sampling time can be used to relieve the burden of accurate synchronization. In
this subsection, we summarize the main results of Sheng and Hu (2005). For more
details, refer to Li et al. (2002), Li and Hu (2003), and Sheng and Hu (2005).


Let $P$ be the number of sensors and $M$ the number of acoustic sources in
the sensor field. Then the received signal measured on the $p$th sensor over the
time interval $n$ can be expressed as

$$x_p(n) = s_p(n) + \nu_p(n) \tag{9.58}$$

where

$$s_p(n) = \gamma_p\sum_{m=1}^{M}\frac{a^{(m)}\bigl(n - t_p^{(m)}\bigr)}{\bigl\|\boldsymbol{\rho}^{(m)}\bigl(n - t_p^{(m)}\bigr) - \mathbf{r}_p\bigr\|} \tag{9.59}$$

is the signal intensity; $\nu_p(n)$ is the background noise modeled as a zero-mean
additive white Gaussian noise with variance $\zeta_p^2$; $t_p^{(m)}$ is the
propagation delay from the $m$th source to the $p$th sensor; and
$a^{(m)}(n - t_p^{(m)})$ represents the intensity of the $m$th acoustic source
measured 1 m away from that source and modeled as a random variable uncorrelated
with the intensities of the other sources. The notations $\boldsymbol{\rho}^{(m)}$,
$\mathbf{r}_p$, and $\gamma_p$ denote the position vector of the $m$th source, the
position vector of the $p$th sensor, and the sensor gain factor of the $p$th
sensor, respectively.

$s_p(n)$ and $\nu_p(n)$ are assumed to be uncorrelated, and as a result the acoustic
energy received at the $p$th sensor can be represented as

$$E\bigl[s_p^2(n)\bigr] = \gamma_p^2\sum_{m=1}^{M}\frac{E\bigl[a^{(m)2}\bigl(n - t_p^{(m)}\bigr)\bigr]}{\bigl\|\boldsymbol{\rho}^{(m)}\bigl(n - t_p^{(m)}\bigr) - \mathbf{r}_p\bigr\|^2} \tag{9.60}$$

$$= g_p\sum_{m=1}^{M}\frac{S^{(m)}\bigl(n - t_p^{(m)}\bigr)}{\bigl\|\boldsymbol{\rho}^{(m)}\bigl(n - t_p^{(m)}\bigr) - \mathbf{r}_p\bigr\|^2} \tag{9.61}$$


where $g_p = \gamma_p^2$ and
$S^{(m)}(n - t_p^{(m)}) = E[a^{(m)2}(n - t_p^{(m)})]$. In practice, the ensemble
average is implemented using a time average over a time window $T = N/F_s$, where
$N$ is the number of samples and $F_s$ is the sampling frequency. Under some
practical assumptions (Sheng and Hu, 2005), the average energy measurement
over the time window $[t - T/2, t + T/2]$ can be modeled as

$$y_p(t) = g_p\sum_{m=1}^{M}\frac{S^{(m)}(t)}{\bigl[d_p^{(m)}(t)\bigr]^2} + \epsilon_p(t) \tag{9.62}$$


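where $d_p^{(m)}(t) = \|\boldsymbol{\rho}^{(m)}(t) - \mathbf{r}_p\|$. Note that
$\epsilon_p(t)$ has a $\chi^2$ distribution with mean $\zeta_p^2$, the background
noise variance, and variance $2\zeta_p^4/N$. When $N$ is sufficiently large
($N \gg 30$), $\epsilon_p(t)$ can be approximated by a normal distribution,
$\mathcal{N}(\mu_p, \sigma_p^2)$, where $\mu_p = \zeta_p^2$ and
$\sigma_p^2 = 2\zeta_p^4/N$. Using Equation (9.62) and omitting the time index $t$
for brevity, the energy-based source localization problem can be formulated using
the maximum likelihood criterion

$$\hat{\boldsymbol{\psi}} = \arg\min_{\boldsymbol{\psi}} \ell(\boldsymbol{\psi}) \tag{9.63}$$

where

$$\boldsymbol{\psi} = \bigl[\boldsymbol{\rho}^T, \mathbf{S}^T\bigr]^T \tag{9.64}$$

$$\ell(\boldsymbol{\psi}) = \|\mathbf{Z} - \mathbf{G}\mathbf{D}\mathbf{S}\|^2 \tag{9.65}$$

and

$$\mathbf{Z} = \left[\frac{y_1 - \mu_1}{\sigma_1}, \ldots, \frac{y_P - \mu_P}{\sigma_P}\right]^T \tag{9.66}$$

$$\mathbf{G} = \operatorname{diag}\left[\frac{g_1}{\sigma_1}, \ldots, \frac{g_P}{\sigma_P}\right] \tag{9.67}$$

$$\mathbf{D} = \left[\frac{1}{\bigl(d_p^{(m)}\bigr)^2}\right]_{P \times M} \tag{9.68}$$

$$\boldsymbol{\rho} = \bigl[\boldsymbol{\rho}^{(1)T}, \ldots, \boldsymbol{\rho}^{(M)T}\bigr]^T \tag{9.69}$$

$$\mathbf{S} = \bigl[S^{(1)}, \ldots, S^{(M)}\bigr]^T \tag{9.70}$$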


Besides the maximum likelihood estimator (9.63), many other energy-based
source localization methods have been presented. The closest point approach
(CPA), generally used for single-source localization and probably the simplest
method, assigns the source location as the location of the sensor with the largest
energy reading. It works reasonably well only when a sufficient number of sensors
are deployed. Other energy-based source localization methods such as energy-ratio
nonlinear least-squares (ER-NLS) and energy-ratio least-squares (ER-LS)
have also been proposed (Li and Hu, 2003). The basic idea of ER-NLS and ER-LS
is to determine target locations by intersecting the target location hyperspheres
of every pair of sensors. Both can be considered as approximations of the ML
solution (9.63) in a single-source scenario (Sheng and Hu, 2005).
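
The energy-based ML criterion can be sketched for a single source by a grid search in which the source energy is eliminated in closed form at each candidate location. This is a minimal illustration under assumptions, not the method of the cited papers: the function names, the clipping of the energy estimate at zero, the grid, and the noise-free toy measurements are all hypothetical.

```python
import numpy as np

def energy_ml_cost(rho, sensors, y, g, mu, sigma):
    """Cost ||Z - G D S||^2 at candidate location rho (single source).

    For one source, G D is a P-vector, so the best source energy S for a
    fixed rho is a scalar least-squares fit (clipped at zero).
    """
    d2 = np.sum((sensors - rho) ** 2, axis=1)   # squared source-sensor distances
    Z = (y - mu) / sigma
    GD = (g / sigma) / d2
    S = max(float(GD @ Z) / float(GD @ GD), 0.0)
    return float(np.sum((Z - GD * S) ** 2)), S

def energy_ml_grid(sensors, y, g, mu, sigma, grid):
    """Return the grid point (row of grid) with the smallest cost."""
    costs = [energy_ml_cost(rho, sensors, y, g, mu, sigma)[0] for rho in grid]
    return grid[int(np.argmin(costs))]

# Noise-free toy scenario (hypothetical values throughout).
sensors = np.array([[0, 0], [10, 0], [0, 10], [10, 10], [5, 4]], dtype=float)
g = np.ones(5)                  # sensor gains g_p
mu = np.full(5, 0.01)           # mean of the averaged noise energy
sigma = np.ones(5)              # standard deviation of the energy error
rho_true, S_true = np.array([2.5, 7.5]), 100.0
y = g * S_true / np.sum((sensors - rho_true) ** 2, axis=1) + mu
# Grid offset by 0.5 m so that no candidate coincides with a sensor.
grid = np.array([[gx + 0.5, gy + 0.5] for gx in range(10) for gy in range(10)])
rho_hat = energy_ml_grid(sensors, y, g, mu, sigma, grid)
```

The closest point approach corresponds to `sensors[np.argmax(y)]`, which needs no search but is limited to the sensor positions themselves.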


9.3 NODE LOCALIZATION METHODS APPLIED 
TO SENSOR NETWORKS 


A typical sensor network consists of a large number of small, inexpensive,
collaborative, and relatively autonomous sensor nodes. These sensors are deployed
to collect local information and respond to changes in the environment by
communicating with neighboring nodes or the central station, making the sensor
network a huge distributed computation machine. Applications featuring sensor
network capabilities include earthquake prediction, agricultural monitoring,
battlefield surveillance, and traffic control, among others.

In many situations, it is useful for sensors to be aware of their relative or
even absolute positions in the network. Since the number of sensors is usually
large and the nodes are often randomly distributed, exactly locating all sensor
positions is a nontrivial task. One possible solution is to equip each sensor node
with a global positioning system (GPS) device. However, this is not practical
for the following reasons (Savvides et al., 2001):


• GPS cannot work indoors or in the presence of dense vegetation, foliage, or 
other obstacles that block the line-of-sight signals from the GPS satellite. 

• GPS devices introduce extra power consumption that shortens the lifetime of 
both the sensor nodes and the network as a whole. 

• The production of GPS devices is expensive, especially when large numbers 
of nodes are to be produced. 

• The size of GPS devices can be an issue when the nodes are required to be 
miniaturized. 


As a result, alternative solutions for node localization are required. 

Techniques for locating sensor nodes in a sensor network generally fall into
two categories: localization with anchor nodes (or beacons) and anchor-free
localization (Sun et al., 2005). Anchor nodes are equipped with special
positioning devices (e.g., GPS) and are therefore aware of their locations. For
anchor-based localization techniques, the sensor nodes with unknown locations try
to determine their positions relative to the anchor nodes. Once their locations
are estimated, they can be treated as anchor nodes and used to locate other
nodes. Clearly, anchor-based localization requires another positioning system
to initialize the estimation process. For anchor-free localization techniques, all
nodes collaborate with each other (within the communication range) and then
construct a relative map. To convert the relative map into absolute position
information, postprocessing is required.

Another classification of node localization can be made between centralized
and distributed algorithms (Patwari et al., 2005). In centralized algorithms,
all the information collected in the sensor network is transferred to a central
node where the information is processed and the location of every node is
estimated. These algorithms are often associated with some constrained
optimization problems, and many of them are nonconvex (Moses et al., 2003).
Finding the global optimum without being trapped in local optimum solutions is in
general challenging, especially when the size of the sensor network increases.
Convex formulations using myriad relaxation techniques (Biswas et al., 2006;
Biswas and Ye, 2004; Doherty et al., 2001; Tseng, 2007) have been developed and
approximate solutions are sought. In favorable conditions, these approximate
solutions perform reasonably well or even solve the original problem (Biswas and
Ye, 2004). They can also be used as initializations for other iterative
localization algorithms to further improve localization accuracy.

Distributed algorithms are motivated by the following observations. First,
there might not be a central node or the central node may not have the capability
or computational power to handle a huge number of calculations. Second,
forwarding all the information in the network to the central node can result in
a communication bottleneck at the central processor. In distributed algorithms,
each node runs a subset of the operations to obtain localization estimates and then
passes them to neighboring nodes. The neighboring nodes then recalculate the
localization estimates and pass the updates to their neighbors until the algorithm
converges. To make an algorithm distributed, the convergence issue needs to be
addressed to ensure that the global optimum is obtained. Exchanging information
locally alleviates the communication bottleneck at the central node, but may
increase total energy consumption compared to what is required to forward the
information to the central node. It is conceivable that a trade-off can be achieved.

Many localization algorithms rely on distance measurements between the
sensor nodes (Gustafsson and Gunnarsson, 2005). These can be estimated based
on signal characteristics such as received signal strength (RSS), time of arrival
(TOA), or TDOA. Network connectivity information can also be used to estimate
distance. For example, the distance between nodes can be estimated by the
number of hops required for the packets to travel from the source to the
destination nodes (Nicolescu and Nath, 2001). For ease of later discussion,
it will be assumed that the distance measurement between a pair of nodes
lying within the communication range can be obtained using any of these
approaches.

In the following sections, we first review formulations of the node
localization problem and then introduce some of the important techniques such as
multidimensional scaling (MDS) and various types of constraint relaxations that
have been successfully applied. In the last part of this section, we present a new
distributed algorithm based on the nonlinear Gauss-Seidel algorithm (Cheng,
2006; Cheng et al., 2009) and a few simulation results.


Without loss of generality, let us consider a two-dimensional (2D) node localization
problem where all sources and sensors are located on a single plane.
Let there be $m$ anchor nodes with known locations $\mathbf{a}_k \in \mathbb{R}^2$, $k = 1,\dots,m$, and
$n$ nodes with unknown locations $\mathbf{x}_i \in \mathbb{R}^2$, $i = 1,\dots,n$. The true Euclidean
distances between $\mathbf{x}_i$ and $\mathbf{x}_j$ and between $\mathbf{x}_i$ and $\mathbf{a}_k$ are denoted as $d_{ij}$ and $d_{ik}$,
respectively. It is assumed that measurements of the distance to neighboring nodes
are available only if these nodes are within radio range, $R$. Denote $w_{ij}$ and $w_{ik}$ as
the measurement errors for $d_{ij}$ and $d_{ik}$, respectively; then the distance
measurements can be expressed as

$$\hat{d}_{ij} = d_{ij} + w_{ij} \tag{9.71}$$

$$\hat{d}_{ik} = d_{ik} + w_{ik} \tag{9.72}$$
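The measurement model of Equations (9.71) and (9.72) can be illustrated with a short simulation. The following is only a minimal sketch: the function name `simulate_distance_measurements`, the data layout, and the use of a single common noise standard deviation `sigma` are assumptions made for illustration and do not come from the text.

```python
import math
import random

def simulate_distance_measurements(nodes, anchors, radio_range, sigma, rng):
    """Return noisy distances for all pairs closer than the radio range.

    nodes, anchors: lists of (x, y) positions (hypothetical layout).
    sigma: common noise standard deviation (an assumption of this sketch).
    """
    node_pairs = {}
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):      # keep only i < j, as in N_x
            d = math.dist(nodes[i], nodes[j])
            if d < radio_range:
                node_pairs[(i, j)] = d + rng.gauss(0.0, sigma)
    anchor_pairs = {}
    for i, x in enumerate(nodes):
        for k, a in enumerate(anchors):          # node-anchor pairs, as in N_a
            d = math.dist(x, a)
            if d < radio_range:
                anchor_pairs[(i, k)] = d + rng.gauss(0.0, sigma)
    return node_pairs, anchor_pairs

rng = random.Random(0)
nodes = [(0.1, 0.1), (0.3, 0.2), (0.9, 0.9)]
anchors = [(0.0, 0.0), (1.0, 1.0)]
# sigma = 0 reproduces the noise-free case
d_nodes, d_anchors = simulate_distance_measurements(nodes, anchors, 0.5, 0.0, rng)
```

Only pairs closer than the radio range produce a measurement, which is why the resulting dictionaries are sparse.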


9.3.1 Noise-Free Node Localization 

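In the noise-free case, $\hat{d}_{ij} = d_{ij}$ and $\hat{d}_{ik} = d_{ik}$. Thus, the node localization problem
can be formulated as

$$
\begin{aligned}
\text{Find} \quad & \mathbf{x}_1,\dots,\mathbf{x}_n \\
\text{s.t.} \quad & \|\mathbf{x}_i-\mathbf{x}_j\| = d_{ij}, \quad \forall (i,j)\in N_x \\
& \|\mathbf{x}_i-\mathbf{a}_k\| = d_{ik}, \quad \forall (i,k)\in N_a
\end{aligned} \tag{9.73}
$$

where $N_x = \{(i,j) \mid i<j;\ \|\mathbf{x}_i^{\text{true}} - \mathbf{x}_j^{\text{true}}\| < R\}$; $N_a = \{(i,k) \mid \|\mathbf{x}_i^{\text{true}} - \mathbf{a}_k\| < R\}$; and
$\mathbf{x}_i^{\text{true}}$ is the true location of the $i$th node. Define $\mathbf{X} = [\mathbf{x}_1,\dots,\mathbf{x}_n] \in \mathbb{R}^{2\times n}$. Then
Equation (9.73) can be rewritten (Biswas et al., 2006) as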


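$$
\begin{aligned}
\text{Find} \quad & \mathbf{X}\in\mathbb{R}^{2\times n},\ \mathbf{Y}\in\mathbb{R}^{n\times n} \\
\text{s.t.} \quad & (\mathbf{e}_i-\mathbf{e}_j)^T \mathbf{Y} (\mathbf{e}_i-\mathbf{e}_j) = d_{ij}^2, \quad \forall (i,j)\in N_x \\
& \begin{bmatrix}\mathbf{a}_k\\ -\mathbf{e}_i\end{bmatrix}^T
\begin{bmatrix}\mathbf{I} & \mathbf{X}\\ \mathbf{X}^T & \mathbf{Y}\end{bmatrix}
\begin{bmatrix}\mathbf{a}_k\\ -\mathbf{e}_i\end{bmatrix} = d_{ik}^2, \quad \forall (i,k)\in N_a \\
& \mathbf{Y} = \mathbf{X}^T\mathbf{X}
\end{aligned} \tag{9.74}
$$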


where $\mathbf{e}_i$ is the vector of all zeros except the $i$th element, which is 1. The
semidefinite programming (SDP) formulation of Equation (9.74) can be obtained by
first changing the constraint $\mathbf{Y} = \mathbf{X}^T\mathbf{X}$ to $\mathbf{Y} \succeq \mathbf{X}^T\mathbf{X}$, which is equivalent (Boyd
et al., 1994) to

$$\mathbf{Z} := \begin{bmatrix}\mathbf{I} & \mathbf{X}\\ \mathbf{X}^T & \mathbf{Y}\end{bmatrix} \succeq 0 \tag{9.75}$$

The notation $\mathbf{A} \succeq \mathbf{B}$ means that $\mathbf{A} - \mathbf{B}$ is a positive semidefinite matrix. Using the
new variable $\mathbf{Z}$, the original problem is converted into a standard SDP feasibility
problem (Biswas et al., 2006):


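$$
\begin{aligned}
\text{Find} \quad & \mathbf{Z}\in\mathbb{R}^{(n+2)\times(n+2)} \\
\text{s.t.} \quad & [1; 0; \mathbf{0}]^T \mathbf{Z}\, [1; 0; \mathbf{0}] = 1 \\
& [0; 1; \mathbf{0}]^T \mathbf{Z}\, [0; 1; \mathbf{0}] = 1 \\
& [1; 1; \mathbf{0}]^T \mathbf{Z}\, [1; 1; \mathbf{0}] = 2 \\
& [\mathbf{0}; \mathbf{e}_i-\mathbf{e}_j]^T \mathbf{Z}\, [\mathbf{0}; \mathbf{e}_i-\mathbf{e}_j] = d_{ij}^2, \quad \forall (i,j)\in N_x \\
& [\mathbf{a}_k; -\mathbf{e}_i]^T \mathbf{Z}\, [\mathbf{a}_k; -\mathbf{e}_i] = d_{ik}^2, \quad \forall (i,k)\in N_a \\
& \mathbf{Z}\succeq 0
\end{aligned} \tag{9.76}
$$

The notation $[\mathbf{x}; \mathbf{y}]$ is defined as the column vector constructed by stacking $\mathbf{x}$ on
top of $\mathbf{y}$. Biswas et al. (2006) further showed that, if there are $2n + n(n+1)/2$
distance pairs, each of which has an accurate distance measure, then Equation
(9.76) solves the original problem exactly.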
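To see why the lifted constraints encode squared distances, one can check numerically that, when $\mathbf{Y} = \mathbf{X}^T\mathbf{X}$, the quadratic forms in $\mathbf{Z}$ reduce to $\|\mathbf{x}_i-\mathbf{x}_j\|^2$ and $\|\mathbf{x}_i-\mathbf{a}_k\|^2$. The sketch below does this for a small hand-picked configuration; the helper names `quad_form` and `build_Z` and the coordinates are illustrative assumptions, and no SDP solver is involved.

```python
def quad_form(v, Z):
    """Compute v^T Z v for a square matrix Z given as a list of rows."""
    n = len(v)
    return sum(v[r] * Z[r][c] * v[c] for r in range(n) for c in range(n))

def build_Z(X):
    """Build Z = [[I, X], [X^T, X^T X]] from node coordinates X (list of (x, y))."""
    n = len(X)
    Z = [[0.0] * (n + 2) for _ in range(n + 2)]
    Z[0][0] = Z[1][1] = 1.0                       # 2x2 identity block
    for i, (xi, yi) in enumerate(X):
        Z[0][2 + i] = Z[2 + i][0] = xi            # X and X^T blocks
        Z[1][2 + i] = Z[2 + i][1] = yi
        for j, (xj, yj) in enumerate(X):
            Z[2 + i][2 + j] = xi * xj + yi * yj   # Y = X^T X block
    return Z

X = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]          # hypothetical node positions
anchor = (1.0, 5.0)                               # hypothetical anchor position
Z = build_Z(X)

# Node-node constraint vector [0; e_i - e_j] for nodes 0 and 1
v_nodes = [0.0, 0.0, 1.0, -1.0, 0.0]
# Node-anchor constraint vector [a_k; -e_i] for node 2
v_anchor = [anchor[0], anchor[1], 0.0, 0.0, -1.0]

sq_dist_nodes = quad_form(v_nodes, Z)    # equals ||x_0 - x_1||^2
sq_dist_anchor = quad_form(v_anchor, Z)  # equals ||x_2 - a||^2
```

The relaxation replaces the exact block $\mathbf{Y} = \mathbf{X}^T\mathbf{X}$ by the condition $\mathbf{Z} \succeq 0$, so these identities hold with equality only when the relaxation is tight.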

When the distance measurement is contaminated by noise, Equation (9.73)
is not applicable since, in general, no feasible solution satisfies
all constraints. One way of solving this problem is to assign distribution
functions to the measurement errors $w_{ij}$ and $w_{ik}$ and reformulate the problem
using the maximum likelihood criterion (Biswas et al., 2006; Liang et al., 2004).


9.3.2 Maximum Likelihood Formulation 

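Suppose the measurement errors follow the distributions $w_{ij} \sim \mathcal{N}(0, \sigma_{ij}^2)$
and $w_{ik} \sim \mathcal{N}(0, \sigma_{ik}^2)$, where $\mathcal{N}(0, \sigma^2)$ denotes the Gaussian p.d.f. with
zero mean and variance $\sigma^2$. Assuming all the measurement errors are independent,
the maximum likelihood estimate for $\mathbf{X}$ can be obtained by solving

$$
\begin{aligned}
\text{Minimize} \quad & \sum_{(i,k)\in N_a} \frac{1}{\sigma_{ik}^2}\,\epsilon_{ik}^2 + \sum_{(i,j)\in N_x} \frac{1}{\sigma_{ij}^2}\,\epsilon_{ij}^2 \\
\text{s.t.} \quad & \left(\|\mathbf{x}_i-\mathbf{x}_j\| - \hat{d}_{ij}\right)^2 = \epsilon_{ij}^2, \quad \forall (i,j)\in N_x \\
& \left(\|\mathbf{x}_i-\mathbf{a}_k\| - \hat{d}_{ik}\right)^2 = \epsilon_{ik}^2, \quad \forall (i,k)\in N_a
\end{aligned} \tag{9.77}
$$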
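Equation (9.77) is a nonconvex optimization problem, but again admits an
SDP relaxation (Biswas et al., 2006) of

$$
\begin{aligned}
\text{Minimize} \quad & \sum_{(i,k)\in N_a} \frac{1}{\sigma_{ik}^2}\,\epsilon_{ik} + \sum_{(i,j)\in N_x} \frac{1}{\sigma_{ij}^2}\,\epsilon_{ij} \\
\text{s.t.} \quad & [-\hat{d}_{ij}; 1]^T \mathbf{D}_{ij}\, [-\hat{d}_{ij}; 1] = \epsilon_{ij}, \quad \forall (i,j)\in N_x \\
& [-\hat{d}_{ik}; 1]^T \mathbf{D}_{ik}\, [-\hat{d}_{ik}; 1] = \epsilon_{ik}, \quad \forall (i,k)\in N_a \\
& [\mathbf{0}; \mathbf{e}_i-\mathbf{e}_j]^T \mathbf{Z}\, [\mathbf{0}; \mathbf{e}_i-\mathbf{e}_j] = v_{ij}, \quad \forall (i,j)\in N_x \\
& [\mathbf{a}_k; -\mathbf{e}_i]^T \mathbf{Z}\, [\mathbf{a}_k; -\mathbf{e}_i] = v_{ik}, \quad \forall (i,k)\in N_a \\
& \mathbf{D}_{ij} \succeq 0, \quad \forall (i,j)\in N_x \\
& \mathbf{D}_{ik} \succeq 0, \quad \forall (i,k)\in N_a \\
& \mathbf{Z} \succeq 0
\end{aligned} \tag{9.78}
$$

where

$$
\mathbf{D}_{ij} = \begin{bmatrix} 1 & u_{ij} \\ u_{ij} & v_{ij} \end{bmatrix}, \ \forall (i,j)\in N_x, \qquad
\mathbf{D}_{ik} = \begin{bmatrix} 1 & u_{ik} \\ u_{ik} & v_{ik} \end{bmatrix}, \ \forall (i,k)\in N_a
$$

and $\epsilon_{ij}$ and $\epsilon_{ik}$ in Equation (9.78) take the place of the squared errors
$\epsilon_{ij}^2$ and $\epsilon_{ik}^2$ of Equation (9.77). The original problem (9.77) can then be
solved by solving and rounding its SDP relaxation (9.78) (Biswas et al., 2006;
Liang et al., 2004).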


9.3.3 Minimization on the Sum of Absolute Errors 

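Other than the maximum likelihood formulation, the node localization problem
can also be solved by minimizing the sum of the absolute errors (Biswas and Ye,
2004):

$$
\text{Minimize} \quad \sum_{(i,j)\in N_x} \left|\, \|\mathbf{x}_i-\mathbf{x}_j\|^2 - \hat{d}_{ij}^2 \,\right|
+ \sum_{(i,k)\in N_a} \left|\, \|\mathbf{x}_i-\mathbf{a}_k\|^2 - \hat{d}_{ik}^2 \,\right| \tag{9.79}
$$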


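Introducing the nonnegative slack variables $\alpha^{+}$ and $\alpha^{-}$, Equation (9.79) can
then be put into the following form:

$$
\begin{aligned}
\text{Minimize} \quad & \sum_{(i,j)\in N_x} \left(\alpha_{ij}^{+} + \alpha_{ij}^{-}\right) + \sum_{(i,k)\in N_a} \left(\alpha_{ik}^{+} + \alpha_{ik}^{-}\right) \\
\text{s.t.} \quad & (\mathbf{e}_i-\mathbf{e}_j)^T \mathbf{Y} (\mathbf{e}_i-\mathbf{e}_j) - \hat{d}_{ij}^2 = \alpha_{ij}^{+} - \alpha_{ij}^{-}, \quad \forall (i,j)\in N_x \\
& [\mathbf{a}_k; -\mathbf{e}_i]^T \begin{bmatrix}\mathbf{I} & \mathbf{X}\\ \mathbf{X}^T & \mathbf{Y}\end{bmatrix} [\mathbf{a}_k; -\mathbf{e}_i] - \hat{d}_{ik}^2 = \alpha_{ik}^{+} - \alpha_{ik}^{-}, \quad \forall (i,k)\in N_a \\
& \alpha_{ij}^{+},\ \alpha_{ij}^{-},\ \alpha_{ik}^{+},\ \alpha_{ik}^{-} \ge 0 \\
& \mathbf{Y} = \mathbf{X}^T\mathbf{X}
\end{aligned} \tag{9.80}
$$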


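If we change the equality constraint $\mathbf{Y} = \mathbf{X}^T\mathbf{X}$ of Equation (9.80) to $\mathbf{Y} \succeq \mathbf{X}^T\mathbf{X}$
and use the inequality constraint (9.75) as before, the SDP relaxation of Equation
(9.80) is obtained (Biswas and Ye, 2004) as

$$
\begin{aligned}
\text{Minimize} \quad & \sum_{(i,j)\in N_x} \left(\alpha_{ij}^{+} + \alpha_{ij}^{-}\right) + \sum_{(i,k)\in N_a} \left(\alpha_{ik}^{+} + \alpha_{ik}^{-}\right) \\
\text{s.t.} \quad & [1; 0; \mathbf{0}]^T \mathbf{Z}\, [1; 0; \mathbf{0}] = 1 \\
& [0; 1; \mathbf{0}]^T \mathbf{Z}\, [0; 1; \mathbf{0}] = 1 \\
& [1; 1; \mathbf{0}]^T \mathbf{Z}\, [1; 1; \mathbf{0}] = 2 \\
& [\mathbf{0}; \mathbf{e}_i-\mathbf{e}_j]^T \mathbf{Z}\, [\mathbf{0}; \mathbf{e}_i-\mathbf{e}_j] - \hat{d}_{ij}^2 = \alpha_{ij}^{+} - \alpha_{ij}^{-}, \quad \forall (i,j)\in N_x \\
& [\mathbf{a}_k; -\mathbf{e}_i]^T \mathbf{Z}\, [\mathbf{a}_k; -\mathbf{e}_i] - \hat{d}_{ik}^2 = \alpha_{ik}^{+} - \alpha_{ik}^{-}, \quad \forall (i,k)\in N_a \\
& \alpha_{ij}^{+},\ \alpha_{ij}^{-},\ \alpha_{ik}^{+},\ \alpha_{ik}^{-} \ge 0 \\
& \mathbf{Z} \succeq 0
\end{aligned} \tag{9.81}
$$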

The challenge in solving SDP relaxation problems lies in the fact that existing
SDP solvers can only deal with small- to medium-size localization problems
($n < 100$) (Biswas and Ye, 2006). To overcome such difficulties, a distributed
SDP method followed by a gradient search approach has been proposed (Biswas
and Ye, 2006). Other relaxation techniques have been considered in the literature,
such as second-order cone programming (SOCP) relaxation (Doherty et al.,
2001). The SOCP relaxation of Equation (9.79) can be formulated as

$$
\begin{aligned}
\text{Minimize} \quad & \sum_{(i,j)\in N_x} \left| y_{ij} - \hat{d}_{ij}^2 \right| + \sum_{(i,k)\in N_a} \left| y_{ik} - \hat{d}_{ik}^2 \right| \\
\text{s.t.} \quad & y_{ij} \ge \|\mathbf{x}_i-\mathbf{x}_j\|^2, \quad \forall (i,j)\in N_x \\
& y_{ik} \ge \|\mathbf{x}_i-\mathbf{a}_k\|^2, \quad \forall (i,k)\in N_a
\end{aligned} \tag{9.82}
$$

Theoretical and numerical studies of SOCP relaxation in sensor network
localization were presented in Tseng (2007).


9.3.4 Classical Multidimensional Scaling 

Another family of node location algorithms is MDS, which aims to find a
low-dimensional representation of a group of objects such that the distances between
the objects fit a given set of measured pairwise “dissimilarities” as well as
possible. Here we introduce classical multidimensional scaling using Euclidean
distance as the proximity measure.

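Let $\mathbf{P} = [\mathbf{p}_1,\dots,\mathbf{p}_{n+m}] = [\mathbf{x}_1,\dots,\mathbf{x}_n, \mathbf{a}_1,\dots,\mathbf{a}_m]$ be the coordinate matrix of
all sensor nodes and $\delta_{ij}$ and $\hat{\delta}_{ij}$ be the true Euclidean distance and the distance
measurement between $\mathbf{p}_i$ and $\mathbf{p}_j$; then the squared distance matrices, $\mathbf{D}$ and $\hat{\mathbf{D}}$,
can be defined as $[\mathbf{D}]_{ij} = \delta_{ij}^2$ and $[\hat{\mathbf{D}}]_{ij} = \hat{\delta}_{ij}^2$, respectively, $\forall\, 0 < i,j \le n+m$. Since
$\delta_{ij}^2 = \mathbf{p}_i^T\mathbf{p}_i - 2\mathbf{p}_i^T\mathbf{p}_j + \mathbf{p}_j^T\mathbf{p}_j$, the matrix $\mathbf{D}$ can be expressed as

$$\mathbf{D} = \boldsymbol{\psi}\mathbf{1}^T - 2\mathbf{P}^T\mathbf{P} + \mathbf{1}\boldsymbol{\psi}^T \tag{9.83}$$

where $\boldsymbol{\psi} = [\mathbf{p}_1^T\mathbf{p}_1,\dots,\mathbf{p}_{n+m}^T\mathbf{p}_{n+m}]^T$, and $\mathbf{1}$ is the vector of all ones. In classical
MDS, the matrix $\mathbf{D}$ is then shifted to the center, resulting in a centered squared
distance matrix $\mathbf{B}$ (with $N = n + m$),

$$\mathbf{B} = -\tfrac{1}{2}\left(\mathbf{I} - \mathbf{1}\mathbf{1}^T/N\right)\mathbf{D}\left(\mathbf{I} - \mathbf{1}\mathbf{1}^T/N\right) \tag{9.84}$$

$$\phantom{\mathbf{B}} = \left(\mathbf{I} - \mathbf{1}\mathbf{1}^T/N\right)\mathbf{P}^T\mathbf{P}\left(\mathbf{I} - \mathbf{1}\mathbf{1}^T/N\right) \tag{9.85}$$

The second equality holds because $\mathbf{1}^T(\mathbf{I} - \mathbf{1}\mathbf{1}^T/N) = \mathbf{0}^T$, so the terms of
Equation (9.83) that involve $\boldsymbol{\psi}$ vanish. The node localization problem is then
formulated as

$$
\begin{aligned}
\text{minimize} \quad & \|\mathbf{B} - \mathbf{Q}^T\mathbf{Q}\|_F \\
\text{s.t.} \quad & \operatorname{Rank}(\mathbf{Q}) = 2, \quad \forall\, \mathbf{Q}\in\mathbb{R}^{2\times(n+m)}
\end{aligned} \tag{9.86}
$$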


The solution to Equation (9.86) can be obtained by performing the SVD
on $\mathbf{B}$:

$$\mathbf{B} = \mathbf{U}\operatorname{diag}\{\lambda_1,\dots,\lambda_{n+m}\}\mathbf{U}^T, \quad \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_{n+m} \tag{9.87}$$

and then setting $\mathbf{Q} = \operatorname{diag}\{\lambda_1^{1/2}, \lambda_2^{1/2}, 0,\dots,0\}\mathbf{U}^T$, of which only the first two rows
are nonzero. $\mathbf{P}$ is then recovered from the obtained solution $\mathbf{Q}$ up to a translation
and orthogonal transformation. Given a sufficient number of anchor nodes (three
or more in a 2D problem), this relative map can then be transformed to the
absolute map based on the absolute positions of the anchor nodes. In real
applications, $\mathbf{D}$ of Equation (9.84) is replaced by the measured squared distance
matrix $\hat{\mathbf{D}}$.
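The steps of Equations (9.84) through (9.87) can be sketched in a few lines. The sketch below is an illustration rather than the method as published: it replaces the SVD with a simple power iteration plus deflation (adequate only when the two leading eigenvalues of $\mathbf{B}$ are positive, distinct, and dominant, as in the noise-free case), and the function name `classical_mds_2d` and the test points are assumptions.

```python
import math

def classical_mds_2d(D, iters=500):
    """Recover 2D coordinates, up to rotation/reflection/translation,
    from a symmetric matrix D of squared pairwise distances."""
    N = len(D)
    row_mean = [sum(row) / N for row in D]
    total_mean = sum(row_mean) / N
    # Double centering: B = -1/2 (I - 11^T/N) D (I - 11^T/N)
    B = [[-0.5 * (D[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(N)] for i in range(N)]
    coords = [[0.0, 0.0] for _ in range(N)]
    for axis in range(2):
        # Power iteration for the dominant eigenpair of the current B
        v = [math.sin(i + 1.0 + axis) for i in range(N)]  # arbitrary start vector
        for _ in range(iters):
            w = [sum(B[i][j] * v[j] for j in range(N)) for i in range(N)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        lam = sum(v[i] * B[i][j] * v[j] for i in range(N) for j in range(N))
        scale = math.sqrt(max(lam, 0.0))
        for i in range(N):
            coords[i][axis] = scale * v[i]   # row of Q = diag(sqrt(lambda)) U^T
        # Deflate so the next pass finds the second eigenpair
        B = [[B[i][j] - lam * v[i] * v[j] for j in range(N)] for i in range(N)]
    return coords

points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]   # hypothetical layout
D = [[(p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 for q in points] for p in points]
coords = classical_mds_2d(D)
```

The recovered coordinates reproduce the pairwise distances but not the absolute positions; aligning them with known anchor positions is a separate step.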

One of the major drawbacks of classical MDS is that it requires all pairwise
distance measurements, which is not realistic for most power/bandwidth-constrained
sensor network applications. Also, classical MDS minimizes the
error between the squared distances $\delta_{ij}^2$ and $\hat{\delta}_{ij}^2$ instead of the distances
themselves. Because the distance measurement error increases proportionally with
distance, minimizing the squared distance can amplify measurement errors and may
result in poor performance (Costa et al., 2006).


To overcome these difficulties, many improved MDS localization schemes
have been proposed. MDS-MAP(P) (Shang and Ruml, 2004) decentralizes classical
MDS by using patches of local maps that can be built up in a distributed
fashion and then merged to form a global map. The distributed weighted-MDS
(dwMDS) algorithm minimizes a local cost function and is able to adaptively
emphasize the most accurate distance measurements (Costa et al., 2006). Other
variants are abundant in the literature.


9.3.5 Distributed Node Localization Based on the 
Nonlinear Gauss-Seidel Algorithm 


In this section, we present a new distributed algorithm (Cheng, 2006; Cheng
et al., 2005) based on the nonlinear Gauss-Seidel algorithm (Bertsekas and
Tsitsiklis, 1989), in which each node is responsible for computing and updating
its own location estimate based on the estimated locations of neighboring
nodes and the locations of anchor nodes (when they are within radio range).
Once the location estimate is updated, each node broadcasts it to all neighboring
nodes. It stops updating when the magnitude of the updating step is less than a
certain threshold. Additionally, a simple scheduling algorithm updates the node
localization estimates in a consistent order without the use of a global coordinator
or a predefined routing loop.

In this scheme node localization is first formulated as the following nonlinear
least-squares problem:

$$
\begin{aligned}
\text{Minimize} \quad & F(\mathbf{x}_1,\dots,\mathbf{x}_n) \\
\text{where} \quad & F(\mathbf{x}_1,\dots,\mathbf{x}_n) = \sum_{(i,k)\in N_a} |s_{ik}(\mathbf{x}_i)|^2 + \sum_{(i,j)\in N_x} |r_{ij}(\mathbf{x}_i,\mathbf{x}_j)|^2 \\
& r_{ij}(\mathbf{x}_i,\mathbf{x}_j) = \|\mathbf{x}_i-\mathbf{x}_j\|^2 - \hat{d}_{ij}^2, \quad \forall (i,j)\in N_x \\
& s_{ik}(\mathbf{x}_i) = \|\mathbf{x}_i-\mathbf{a}_k\|^2 - \hat{d}_{ik}^2, \quad \forall (i,k)\in N_a
\end{aligned} \tag{9.88}
$$

which is commonly solved using the centralized Gauss-Newton method (Moses
et al., 2003). The main advantage of this formulation is that it admits a simple
distributed implementation with guaranteed convergence because the Jacobian
and Hessian of the cost function $F$ are continuous in the $\mathbb{R}^2$ plane. We also
show in the simulation that when the measurement noise is small, the distributed
implementation of Equation (9.88) yields better performance in terms of accuracy
for a given number of updates.

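From Bertsekas and Tsitsiklis (1989), the nonlinear Gauss-Seidel algorithm
of the $i$th node for the problem in Equation (9.88) is described by

$$
\mathbf{x}_i^{(t+1)} = \arg\min_{\mathbf{x}_i} F\left(\mathbf{x}_1^{(t+1)},\dots,\mathbf{x}_{i-1}^{(t+1)}, \mathbf{x}_i, \mathbf{x}_{i+1}^{(t)},\dots,\mathbf{x}_n^{(t)}\right), \quad i = 1,\dots,n \tag{9.89}
$$

where $t$ is the number of iterations. Note that, if $(i,j) \notin N_x$ or $(j,i) \notin N_x$, the
solution to Equation (9.89) will not depend on $\mathbf{x}_j^{(t)}$ or $\mathbf{x}_j^{(t+1)}$, respectively. Therefore,
we can simplify the optimization problem in Equation (9.89) as

$$\min_{\mathbf{x}_i} F_i^{(t)}(\mathbf{x}_i), \tag{9.90}$$

where

$$
F_i^{(t)}(\mathbf{x}_i) = \sum_{(j,i)\in N_x} \left|r_{ji}\left(\mathbf{x}_j^{(t+1)}, \mathbf{x}_i\right)\right|^2
+ \sum_{(i,j)\in N_x} \left|r_{ij}\left(\mathbf{x}_i, \mathbf{x}_j^{(t)}\right)\right|^2
+ \sum_{(i,k)\in N_a} |s_{ik}(\mathbf{x}_i)|^2
$$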

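Given $\mathbf{x}_i^{(t)}$, the estimate of node $i$'s location at iteration $t$, we can approximate
Equation (9.90) as a linear least-squares problem using the Gauss-Newton
method:

$$
\min_{\mathbf{g}_i} \ \sum_{(j,i)\in N_x} \left| r_{ji} + (\nabla_{\mathbf{x}_i} r_{ji})^T \mathbf{g}_i \right|^2
+ \sum_{(i,j)\in N_x} \left| r_{ij} + (\nabla_{\mathbf{x}_i} r_{ij})^T \mathbf{g}_i \right|^2
+ \sum_{(i,k)\in N_a} \left| s_{ik} + (\nabla_{\mathbf{x}_i} s_{ik})^T \mathbf{g}_i \right|^2 \tag{9.91}
$$

where $\mathbf{g}_i = \mathbf{x}_i - \mathbf{x}_i^{(t)}$; $r_{ji} = r_{ji}(\mathbf{x}_j^{(t+1)}, \mathbf{x}_i^{(t)})$; $r_{ij} = r_{ij}(\mathbf{x}_i^{(t)}, \mathbf{x}_j^{(t)})$; $s_{ik} = s_{ik}(\mathbf{x}_i^{(t)})$; and
$\nabla_{\mathbf{x}_i}$ is the gradient with respect to $\mathbf{x}_i$. Note that Equation (9.91) is now a
least-squares problem with only two variables. Let $\mathbf{g}_i^{(t)}$ be the solution to Equation
(9.91); the updated estimate of sensor $i$ is given by $\mathbf{x}_i^{(t+1)} = \mathbf{x}_i^{(t)} + \alpha_i \mathbf{g}_i^{(t)}$,
where $\alpha_i$ is the step length.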

The value of $\alpha_i$ can be found using a backtracking line search (Boyd and
Vandenberghe, 2004). That is, find an $\alpha_i$ such that

$$
F_i^{(t)}\left(\mathbf{x}_i^{(t)} + \alpha_i \mathbf{g}_i^{(t)}\right) \le F_i^{(t)}\left(\mathbf{x}_i^{(t)}\right) + c\,\alpha_i \nabla_{\mathbf{x}_i} F_i^{(t)}\left(\mathbf{x}_i^{(t)}\right)^T \mathbf{g}_i^{(t)} \tag{9.92}
$$

where $0 < c < 0.5$ is a predetermined constant that can be interpreted as the
fraction of the decrease in $F_i^{(t)}$ predicted by a linear extrapolation that we
will accept. This is known as the damped Gauss-Newton method (Dennis and
Schnabel, 1996). Note that only local information is required to compute the
step $\mathbf{g}_i^{(t)}$ and find the step length $\alpha_i$. After the location estimate is updated, node
$i$ broadcasts it to all neighboring nodes. For each node, the update terminates
if $\|\mathbf{g}_i^{(t)}\| < \epsilon$, where $\epsilon$ is a predetermined threshold. Once the node decides to
terminate the update process, it sends out a message to its neighboring nodes
so that the rest of the nodes can keep updating using the broadcasting node’s last
location estimate $\mathbf{x}_i^{(\text{final})}$ without waiting for an update. The updating process
for each node from iteration $t$ to iteration $t + 1$ is summarized in Algorithm 9.1.
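Since $\nabla_{\mathbf{x}_i} r_{ji} = 2(\mathbf{x}_i^{(t)} - \mathbf{x}_j^{(t+1)})$, $\nabla_{\mathbf{x}_i} r_{ij} = 2(\mathbf{x}_i^{(t)} - \mathbf{x}_j^{(t)})$, $\nabla_{\mathbf{x}_i} s_{ik} = 2(\mathbf{x}_i^{(t)} - \mathbf{a}_k)$, and
$\mathbf{A}^T\mathbf{A}$ is a $2 \times 2$ matrix, the computation required for each node is minimal
and thus the algorithm can be implemented on sensor nodes that have basic
processing capabilities.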


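ALGORITHM 9.1 Gauss-Newton Update Process


Let $\mathbf{A} = [\nabla_{\mathbf{x}_i} r_{ji},\dots, \nabla_{\mathbf{x}_i} r_{ij},\dots, \nabla_{\mathbf{x}_i} s_{ik},\dots]^T$, $\mathbf{b} = -[r_{ji},\dots, r_{ij},\dots, s_{ik},\dots]^T$.
Compute the step $\mathbf{g}_i^{(t)} = (\mathbf{A}^T\mathbf{A})^{-1}\mathbf{A}^T\mathbf{b}$.

Compute the gradient $\nabla_{\mathbf{x}_i} F_i^{(t)}(\mathbf{x}_i^{(t)}) = -2\mathbf{A}^T\mathbf{b}$.

If $\|\mathbf{g}_i^{(t)}\| > \epsilon$

    While $F_i^{(t)}(\mathbf{x}_i^{(t)} + \alpha_i \mathbf{g}_i^{(t)}) > F_i^{(t)}(\mathbf{x}_i^{(t)}) + c\,\alpha_i \nabla_{\mathbf{x}_i} F_i^{(t)}(\mathbf{x}_i^{(t)})^T \mathbf{g}_i^{(t)}$

        Reduce the step length $\alpha_i$ (for example, $\alpha_i := \beta\alpha_i$ with $0 < \beta < 1$).

    end

    Update the estimate $\mathbf{x}_i^{(t+1)} = \mathbf{x}_i^{(t)} + \alpha_i \mathbf{g}_i^{(t)}$.

    Broadcast the updated estimate $\mathbf{x}_i^{(t+1)}$ to the neighboring nodes.

else

    Stop update.

    Broadcast $\mathbf{x}_i^{(\text{final})}$ to the neighboring nodes.

end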
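A single node's damped Gauss-Newton update can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: neighbors and anchors are treated uniformly as fixed reference points with measured distances, the shrink factor `beta`, the initial step length of 1, and the names `gauss_newton_update`, `references`, and `local_cost` are all hypothetical, and the gradient is taken as $-2\mathbf{A}^T\mathbf{b}$ (the exact gradient of the sum of squared residuals).

```python
import math

def residuals_and_jacobian(x, references):
    """Residuals ||x - p||^2 - d^2 and their gradients 2 (x - p).
    references: list of ((px, py), measured_distance)."""
    res, jac = [], []
    for (px, py), d in references:
        dx, dy = x[0] - px, x[1] - py
        res.append(dx * dx + dy * dy - d * d)
        jac.append((2.0 * dx, 2.0 * dy))
    return res, jac

def local_cost(x, references):
    res, _ = residuals_and_jacobian(x, references)
    return sum(r * r for r in res)

def gauss_newton_update(x, references, c=0.25, beta=0.5, eps=1e-9):
    """Return (new_estimate, stopped) after one damped Gauss-Newton step."""
    res, A = residuals_and_jacobian(x, references)
    b = [-r for r in res]
    # Normal equations (A^T A) g = A^T b, solved in closed form (2x2 system)
    h11 = sum(a[0] * a[0] for a in A)
    h12 = sum(a[0] * a[1] for a in A)
    h22 = sum(a[1] * a[1] for a in A)
    q1 = sum(a[0] * bi for a, bi in zip(A, b))
    q2 = sum(a[1] * bi for a, bi in zip(A, b))
    det = h11 * h22 - h12 * h12   # assumed nonzero (references not collinear with x)
    g = ((h22 * q1 - h12 * q2) / det, (h11 * q2 - h12 * q1) / det)
    if math.hypot(g[0], g[1]) <= eps:
        return x, True            # step below threshold: stop updating
    slope = -2.0 * (q1 * g[0] + q2 * g[1])   # gradient^T g, negative for a descent step
    f0 = local_cost(x, references)
    alpha = 1.0
    # Backtracking line search on the sufficient-decrease condition
    while alpha > 1e-12 and local_cost(
            (x[0] + alpha * g[0], x[1] + alpha * g[1]), references) > f0 + c * alpha * slope:
        alpha *= beta
    return (x[0] + alpha * g[0], x[1] + alpha * g[1]), False

true_pos = (0.3, 0.4)                                   # hypothetical true location
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]           # hypothetical reference points
references = [(p, math.dist(p, true_pos)) for p in points]
estimate, done = (0.5, 0.5), False
for _ in range(50):
    estimate, done = gauss_newton_update(estimate, references)
    if done:
        break
```

In the distributed setting, the reference positions for neighboring nodes would be their most recently broadcast estimates rather than fixed points.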
It is possible to further reduce the amount of time for all nodes to update
from $t$ to $t + 1$ while satisfying the Gauss-Seidel algorithm by exploiting the
network structure. Note that in Equation (9.90) only the location estimates of
the neighboring nodes $\{\mathbf{x}_j^{(t+1)} \mid (j,i) \in N_x\}$ and $\{\mathbf{x}_j^{(t)} \mid (i,j) \in N_x\}$ are needed for
updating node $i$. Therefore, node $i$ can determine locally when to update itself
without waiting for other nonneighboring nodes. We propose a simple scheduling
algorithm that satisfies the Gauss-Seidel updating rule as follows.

Scheduling Algorithm. Node $i$ starts to update its estimate $\mathbf{x}_i^{(t)}$ if one of the
following criteria is met:

• Node $i$ receives $\{\mathbf{x}_j^{(t+1)}$ or $\mathbf{x}_j^{(\text{final})} \mid j < i,\ (j,i) \in N_x\}$ and $\{\mathbf{x}_j^{(t)}$ or $\mathbf{x}_j^{(\text{final})} \mid j > i,\ (i,j) \in N_x\}$.

• The timeout threshold is reached.


In the ideal situation where all updated messages are received by neighboring 
nodes, the first criterion ensures that the nodes update according to the Gauss- 
Seidel algorithm. However, if some of the updated messages are not received 
by the intended neighboring nodes, applying only the first criterion will halt the 
entire updating process. Therefore, the second criterion is used to ensure that the 
updating process resumes in the event of outage in the wireless transmission. In 
this case, a timer is used to record the idling time from the last update. Once the 
idling time reaches the timeout threshold, the node will stop waiting for update 
messages from its neighboring nodes and start the update using the last received 
information. 

Two properties result from this scheduling algorithm. First, each node waits
only for its neighboring nodes, instead of $\{j \mid j < i\}$ as in the sequential case. Thus,
a subset of nodes in the network can update simultaneously if they are not each
other’s neighbors. Second, it can be seen that two connected nodes cannot be
updated at the same time. Thus, when one node is updating and broadcasting, its
neighbors are not. This reduces the probability of message collision.

The proposed scheduling algorithm can be demonstrated by the simple example
shown in Figure 9.7 (Cheng, 2006). Consider a network of six nodes and seven
edges (or connections). Assume that $\{\mathbf{x}_j^{(t)} \mid (i,j) \in N_x\}$ are available to node $i$ for
$i = 1,\dots,6$. Denote $T = 1, 2, \dots$ as the instance of time. The processing time for
each node to update from iteration $t$ to iteration $t + 1$ is one unit. At $T = 1$, node
1 and node 2 will start to update simultaneously since both satisfy the updating
rule. This is shown in Figure 9.7(a). After nodes 1 and 2 have updated and
broadcast their estimates $\mathbf{x}_1^{(t+1)}$ and $\mathbf{x}_2^{(t+1)}$ to their neighbors, nodes 3, 4, and 6
will start to update simultaneously at $T = 2$, as shown in Figure 9.7(b). Finally,
at $T = 3$, node 5 will start to update after it receives the updated estimates from
nodes 3 and 4. Notice that in Figure 9.7(c), node 1 will also start to update again
to iteration $t + 2$ since it has received the updated estimates $\mathbf{x}_j^{(t+1)}$ from its
neighbors. In this simple example, it only takes three units of time to update all
six nodes from iteration $t$ to iteration $t + 1$, while in the sequential case it would
take six units.
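The first scheduling criterion can be simulated by assigning each node the earliest time slot that follows the updates of all its lower-indexed neighbors. The sketch below assumes unit processing time and no lost messages. The edge list is a hypothetical six-node, seven-edge graph chosen to reproduce the update order described above; it is not taken from Figure 9.7, and the function name `gauss_seidel_schedule` is likewise an assumption.

```python
def gauss_seidel_schedule(num_nodes, edges):
    """Return {node: time slot} for one Gauss-Seidel sweep.

    A node updates one slot after the last of its lower-indexed
    neighbors; estimates from higher-indexed neighbors are assumed
    to be available from the previous iteration.
    """
    lower = {i: [] for i in range(1, num_nodes + 1)}
    for i, j in edges:
        a, b = min(i, j), max(i, j)
        lower[b].append(a)
    slot = {}
    for i in range(1, num_nodes + 1):   # increasing order: lower neighbors already scheduled
        slot[i] = 1 + max((slot[j] for j in lower[i]), default=0)
    return slot

# Hypothetical graph: six nodes, seven edges
edges = [(1, 3), (1, 4), (2, 3), (2, 4), (2, 6), (3, 5), (4, 5)]
slots = gauss_seidel_schedule(6, edges)
total_time = max(slots.values())
```

For this graph the sweep finishes in three time units instead of the six required by a strictly sequential update.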

Figure 9.8 (Cheng, 2006) shows the average total processing time for the
proposed scheduling rule to update all nodes in randomly generated networks
as the size of the network increases. In this simulation, the node locations are
uniformly distributed over an area of $1.6\sqrt{n/100} \times 1.6\sqrt{n/100}$, where $n$ is the
network size (total number of nodes) and 100 realizations are used for each network
to compute the average processing time. For simplicity and demonstration,
we assume that all nodes have the same processing time in this simulation. In
practice, the scheduling algorithm does not require such an assumption.


Simulation Results 

FIGURE 9.8 Average processing time for the proposed scheduling rule as a function of network size (n).

Consider a network with $n$ and $n/10$ randomly placed sensor nodes and anchor
nodes, respectively. If the distance between two nodes or a node and an anchor
node is less than radio range $R$, a noisy measurement of the distance is available
and is given by $\hat{d}_{ij} = d_{ij} + w_{ij}$, where $w_{ij}$ is the noise modeled as zero-mean
Gaussian noise with standard deviation $\sigma_{ij}$ related to the accuracy of the distance
measurement. We evaluate the performance of the algorithm by the average
normalized error (with respect to radio range) of the estimated location to the
true location: $\frac{1}{nR}\sum_{i=1}^{n} \|\mathbf{x}_i^{(t)} - \mathbf{x}_i^{\text{true}}\|$.
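The performance metric can be computed directly from the estimated and true positions. The sketch below assumes the metric is the mean Euclidean error divided by the radio range, as the description suggests; the function name `average_normalized_error` and the sample values are illustrative only.

```python
import math

def average_normalized_error(estimates, truths, radio_range):
    """Mean Euclidean location error, normalized by the radio range."""
    n = len(truths)
    total = sum(math.dist(e, t) for e, t in zip(estimates, truths))
    return total / (n * radio_range)

# Two nodes: the first is off by 0.5, the second is exact
estimates = [(0.0, 0.0), (1.0, 1.0)]
truths = [(0.3, 0.4), (1.0, 1.0)]
err = average_normalized_error(estimates, truths, radio_range=0.5)
```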

Figure 9.9 shows examples of the algorithm with 60 sensors and 6 anchors
placed randomly in a 0.6 × 0.6 region. The radio range is set to 0.35 and $\sigma_{ij}$ = 0.02.
The starting points are chosen as random perturbations from the true sensor
locations, where the perturbation is given by a uniformly distributed random
variable in [−0.2, 0.2]. The normalized error for the starting point is 0.4628. The
x represents the estimate of the true node location represented by the o. Anchor
nodes are represented by o. The normalized error after 10 iterations is 0.0456.
FIGURE 9.9 Realization of node localization for 60 sensor nodes and 6 anchor nodes: (a) initial
estimates and (b) estimation results.

FIGURE 9.10 Comparisons of normalized errors of radio range (a) and noise level (b) for three
node localization algorithms.

FIGURE 9.11 Comparisons of normalized errors of initial estimate (a) and number of anchor
nodes (b) for three node localization algorithms.


Figures 9.10 and 9.11 compare the proposed distributed algorithm with two
other localization methods: the distributed Gauss-Newton method with the maximum
likelihood formulation and the dwMDS (Costa et al., 2006) method. In this simulation,
100 nodes and 10 anchors are randomly placed in a 0.6 × 0.6 region. Each node
in the network updates 20 times, $t = 20$, and the normalized errors are averaged
over 20 random realizations. Figure 9.10(a) shows the comparisons for different
radio ranges ($R$) in the absence of measurement noise. Figure 9.10(b) shows the
comparisons for different measurement noise levels ($\sigma_{ij}$) with radio range $R$ set
to 0.3. Notice that in situations where the noise levels are small, the proposed
algorithm yields better estimation accuracy for a given number of iterations.
However, as the noise level increases, the nonlinear least-squares formulation
becomes suboptimal, as expected.

Figure 9.11 shows the sensitivity of the localization algorithms to initial estimate
error and total number of anchor nodes. In both simulations, the radio
range $R$ is set to 0.3 and the noise level $\sigma_{ij}$ is set to 0.01. It can be seen
that the proposed distributed algorithm is less sensitive to the initial estimate
error and yields better estimation results for a given number of anchor nodes
when compared with the two other methods. An interesting observation from
the previous four simulation results is that the distributed algorithm with the
maximum likelihood formulation behaves almost identically to the dwMDS
algorithm.


9.4 SOURCE LOCALIZATION APPLICATIONS USING 
SENSOR NETWORKS 

In this section, we describe some experimental results from using the DOAs
to perform source localization in the far field. In Chen et al. (2003), we built a
wireless acoustical testbed using Compaq iPAQ 3760 Pocket PCs as the testbed
nodes. Each iPAQ had a microphone, an ADC, a codec, and an 802.11b wireless
card. The combination of COTS hardware and open-source operating system
was used to implement both the TDOA-CLS and AML (Chan and Ho, 1994)
DOA estimation algorithms. The choice of these devices, which admittedly
may be quite different from the nodes of a sensor network, was motivated
by the desire to set up a working testbed that would allow the generation
of significant data as well as the development and testing of localization
algorithms.

In the first experimental setting, the source (a computer speaker) was placed
in the middle of a wirelessly connected square array (each side of length $\ell$) of the
iPAQs. In this near-field case (relative to the array), the four nodes acted as one
array with inter-sensor spacing $\ell$ of 20 ft (6.1 m). The sound of a moving
light-wheeled vehicle was played through the speaker and collected by the microphone
array embedded in the iPAQs. Figure 9.12(a) shows the direct localization results
of the speaker using the AML method. A similar experiment was conducted using
the same configuration except that $\ell$ was 40 ft (12.2 m). This time the loudspeaker
played prerecorded organ music, the spectrum of which had a 2-kHz bandwidth
with a central frequency at 1.75 kHz. The AML source localization result is
shown in Figure 9.12(b). From the RMS errors of the previous two experiments,
both algorithms show comparably promising results.

FIGURE 9.12 AML source localization: (a) vehicle source and (b) music source. (Source: From
Chen et al., 2003, © 2003 IEEE, with permission.)

The second experimental setting is depicted in Figure 9.13(a), where three
triangular subarrays, each with three iPAQs, formed the sensor network. In this
far-field case (relative to each subarray), the DOA of the source was independently
estimated in each subarray using a far-field version of the ML estimator
described in Section 9.2.3. The location estimate was subsequently determined
by crossing these bearing estimates and performing a least-squares fit. In this
experiment, the speaker was placed at four distinct source locations S1, ..., S4,
simulating the movement of the source, while the same vehicle sound was
played each time. Figure 9.13(b) depicts one snapshot (for clear illustration)
of the AML and TDOA-CLS results at the four locations. We note that better
results were clearly obtained when the source was inside the convex hull of the
overall array. Moreover, the RMS errors of Figure 9.12 are much smaller than the
RMS errors here: direct source localization benefits from the favorable geometry,
the shorter ranges, and the fully coherent processing, in contrast to coherent
DOA estimation followed by the noncoherent cross-bearing process.
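The bearing-crossing step can be sketched as a small linear least-squares problem: each subarray contributes a line through its position along the estimated bearing, and the source estimate is the point minimizing the sum of squared perpendicular distances to these lines. This is a generic formulation offered as an illustration, not necessarily the exact fit used in the experiment; the function name `cross_bearings`, the angle convention (counterclockwise from the x-axis, in radians), and the geometry are assumptions.

```python
import math

def cross_bearings(array_positions, bearings):
    """Least-squares intersection of bearing lines.

    array_positions: list of (x, y) subarray reference points.
    bearings: list of angles in radians, counterclockwise from the x-axis.
    Minimizes sum_k (n_k^T (p - c_k))^2, where n_k is the unit normal
    to the k-th bearing line.
    """
    s11 = s12 = s22 = r1 = r2 = 0.0
    for (cx, cy), theta in zip(array_positions, bearings):
        nx, ny = -math.sin(theta), math.cos(theta)   # normal to the bearing direction
        s11 += nx * nx
        s12 += nx * ny
        s22 += ny * ny
        proj = nx * cx + ny * cy
        r1 += nx * proj
        r2 += ny * proj
    det = s11 * s22 - s12 * s12   # nonzero unless all bearings are parallel
    return ((s22 * r1 - s12 * r2) / det, (s11 * r2 - s12 * r1) / det)

source = (2.0, 3.0)                                   # hypothetical source position
arrays = [(0.0, 0.0), (5.0, 0.0), (0.0, 6.0)]         # hypothetical subarray positions
angles = [math.atan2(source[1] - cy, source[0] - cx) for cx, cy in arrays]
estimate = cross_bearings(arrays, angles)
```

With noise-free bearings the lines meet at the source; with noisy bearings the estimate degrades as the source moves outside the convex hull of the subarrays, because the lines then cross at shallow angles.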

A few more far-field cases are considered. In the third experimental setting,
depicted in Figure 9.14(a), three linear subarrays, each with three iPAQs, formed
one sensor network. The speaker, this time playing the organ music, was placed
at six distinct locations. Figure 9.14(b) shows the one-snapshot results of the two
algorithms at the six locations. The RMS error calculation shows performance
similar to the second experiment, which demonstrates that both AML and
TDOA-CLS can locate different sources.
In the fourth experimental setting, depicted in Figure 9.15, four square subarrays,
each with four iPAQs, formed a single network. Two speakers, one playing
the vehicle sound and the other one playing the organ music simultaneously,
were placed inside the convex hull of the sensors. Figure 9.16(a) shows the AML
localization results when only the vehicle source was active, while Figure 9.16(b)
shows the AML localization results when both sources were active. Comparing
Figures 9.16(a) and 9.16(b), we found that good multisource localization results
can be obtained by AML, but not as good as the performance with only a single
source. When the number of subarray elements increases, localization accuracy
also improves, which agrees with the Cramér-Rao bound analysis reported in
Stoica and Nehorai (1989).
Chapter 10



Direct Position Determination: 
A Single-Step Emitter 
Localization Approach 

Alon Amar, Anthony J. Weiss 


10.1 BACKGROUND 

Emitter localization attracts significant interest in the signal processing, radar,
sonar, bioengineering, seismology, and astronomy literature. Emitter location
techniques are currently used for many purposes, such as emergency cellular
… (and law breaking), fraud detection, and homeland security (… et al., 2005;
Gustafsson and Gunnarsson, 2005; Patwari et al., 2005; Sayed et al., 2005; Sun
et al., 2005; Zagami et al., 1998).


The localization process is based on the exchange of signals between the
emitter and a number of reference stations. There are mainly two types of
positioning situations: (1) self-positioning, where emitter position is determined
based on the transmitted signals from the stations; and (2) remote positioning,
where the system determines emitter position. The main concern in this chapter is
remote positioning. However, the methods discussed here can be easily applied
to self-positioning.

The traditional methods for determining remote position of a radio frequency 
(RF) emitter are based on a two-step procedure (see Figure 10.1):

Step 1 Estimation of position-related parameters such as angle of arrival 
(AOA), time of arrival (TOA), time difference of arrival (TDOA), received 
signal strength (RSS), and Doppler frequency shift. 

Step 2 Determination of position based on the estimates that were obtained in 
step 1. The unit performing this step is referred to as a location processing unit 
(LPU). 


FIGURE 10.1 Two-step position determination scheme.

FIGURE 10.2 One-step position determination scheme.


Another approach for remote position determination is based on a one-step
procedure where position is estimated directly from observations (see
Figure 10.2).

Surprisingly, the two-step approach is common practice in most systems
using spatially distributed stations. This trend is a result of five factors:

• Tradition.

• The transmission of the first step results to the LPU requires a small amount
of bandwidth.

• The (unjustified) belief that all information on position is included in the
parameters of the first step and therefore nothing better can be done.

• The (unjustified) claim that the maximum likelihood (ML) invariance
theorem (Scharf, 1990) supports the two-step approach.

• In many cases of interest the variance of the two-step error approaches the
Cramér-Rao lower bound (CRLB) as the data set increases. This leads to the
false belief that the two-step approach cannot be outperformed even for short
data sets or low signal-to-noise ratio (SNR).

10.2 LOCALIZATION FOR STATIONARY GEOMETRY 


Perhaps the first paper on the mathematics of emitter location, using AOA, was
Stansfield’s (1947). Many other publications followed, including a fine review
by Torrieri (1984). Krim and Viberg (1996) and Wax (1996) presented comprehensive
reviews on antenna array processing for location by AOA. Recently,
Van Trees (2002) published a book fully devoted to array processing.

Positioning by TOA is used extensively in cellular phone localization (Liberti
…) … (Carter, 1993). Its traditional techniques can be classified as decentralized
processing (Kozick and Sadler, …). Wax and Kailath (1985) discussed eigenstructure
algorithms for narrowband signals observed by multiple arrays, assuming perfect
spatial coherence across each array but no coherence between arrays. Stoica et al.
(1995) proposed variants of the direction estimation algorithm for decentralized
processing. Weinstein (1981) discussed pairwise processing as an alternative
to centralized processing of a wideband single array. His results indicated that
pairwise processing may be used at high SNR without significant loss of
performance. Recently, Kozick and Sadler (2004) presented a performance analysis of
localization of a single source using multiple distributed arrays. They assumed
perfect spatial coherence over each array and frequency-selective coherence
between arrays. Their proposed method is based on bearing estimation at each
array and delay estimation.

As indicated in Wax and Kailath (1983), measuring AOA/TOA at each base
station separately and independently is suboptimal, since this approach ignores
the constraint that the measurements
must correspond to the same source position. Moreover, the base stations
are geographically separated, and therefore the desired signal is often weak
or absent in some of them. Thus, the system must somehow ensure that
all AOA/TOA measurements used to locate a specific source correspond to
the same source. In the case of cochannel simultaneous sources, the localization
system confronts an association problem of deciding which of the multiple
AOA/TOA estimates reported by the base stations correspond to which source.

We now discuss the direct position determination (DPD) approach, which
solves the localization problem using the same data collected at all base stations
together. The DPD method takes advantage of the rather simple propagation
assumptions usually used for RF signals. We assume line-of-sight (LOS)
propagation with unknown complex attenuation at each base station.

According to the DPD approach, each base station transfers the intercepted 
signals to the LPU, where the set of positions that best matches all collected 
data simultaneously is determined. Although there are many stray parameters, 
only a two-dimensional (2D) search is required for a 2D geometry. Positioning 
with DPD is performed in a single step. For illustration, Figure 10.3 shows the
cost function of a planar geometry (receivers and emitters are all in a single 
plane) consisting of four base stations, each equipped with an antenna array, and 
three emitters. DPD determines the emitters' positions by selecting the peaks
of the cost function in a single grid search. Performance comparison of DPD 
and the AOA-based technique for unknown as well as known waveforms shows 
that, for high SNR, both methods converge to the CRLB while, for low SNR, 
DPD achieves better accuracy. The DPD estimator does not use the first step of 
the conventional methods and therefore does not encounter the requirement of 
associating measurements to sources. 


10.2.1 Problem Formulation 

Consider $L$ stations located at geographically separated sites within a plane. Each
station is equipped with an antenna array that consists of $M$ sensors. Denote by
$\mathbf{p}_q \triangleq [x_q, y_q]^T$, $q = 1, \ldots, Q$, the position of the $q$th
source. The vector of signal complex envelopes observed by the elements of the
$\ell$th array is given by

$$\mathbf{r}_\ell(t) = \sum_{q=1}^{Q} b_{q,\ell}\, \mathbf{a}_{q,\ell}\, s_q(t - \tau_{q,\ell}) + \mathbf{n}_\ell(t), \qquad 0 \le t \le T \tag{10.1}$$

where $T$ is the observation time, $b_{q,\ell}$ is an unknown complex scalar representing
the attenuation of the $q$th signal as observed by the $\ell$th array, $s_q(t)$ is the $q$th
wideband signal waveform, $\tau_{q,\ell}$ is the propagation delay from the $q$th emitter to
the reference point of the $\ell$th array, $\mathbf{n}_\ell(t)$ represents the noise and
interference observed by the array, and $\mathbf{a}_{q,\ell}$ is the $M \times 1$ column vector
that denotes the $\ell$th array response to the $q$th source. We assume that the noise at
the output of each of the array elements is a wide-sense stationary, zero-mean, complex
Gaussian process, uncorrelated with the noise at the other antenna elements and
uncorrelated with the signals.

FIGURE 10.3 Contour description of the DPD cost function for four base stations, each with an
antenna array, and three emitters. (Horizontal axis: X-axis, in meters.)

The first stage of the processing consists of partitioning the observation
interval into $K$ sections, each of length $T' = T/K$. In Amar and Weiss (2006,
Appendix G), it is shown that this procedure holds if
$T' \gg \max\{\Delta t_{\max}, \tau_{\max}\}$, where $\Delta t_{\max}$ is the propagation
time between the two most spatially separated stations, and $\tau_{\max}$ is the maximal
correlation time of the processes. Denote by $\mathbf{r}_\ell(k,u)$, $s_q(k,u)$, and
$\mathbf{n}_\ell(k,u)$ the $u$th Fourier coefficients of the $k$th section of
$\mathbf{r}_\ell(t)$, $s_q(t)$, and $\mathbf{n}_\ell(t)$, respectively. Because the observed
signal is a stationary process, it can be represented by a set of Fourier coefficients:


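$$\mathbf{r}_\ell(k,u) = \frac{1}{\sqrt{T'}} \int_{(k-1)T'}^{kT'} \mathbf{r}_\ell(t)\, e^{-i\omega_u t}\, dt \tag{10.2}$$

where $k = 1, \ldots, K$, and $\omega_u = \frac{2\pi}{T'} u$, $u = 1, \ldots, J$. The Fourier
coefficients of Equation (10.1) satisfy

$$\mathbf{r}_\ell(k,u) = \sum_{q=1}^{Q} b_{q,\ell}\, \mathbf{a}_{q,\ell}(u)\, s_q(k,u) + \mathbf{n}_\ell(k,u) \tag{10.3}$$

where we define

$$\mathbf{a}_{q,\ell}(u) \triangleq \mathbf{a}_{q,\ell}\, e^{-i\omega_u \tau_{q,\ell}} \tag{10.4}$$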

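The noise vectors $\mathbf{n}_\ell(k,u)$ are independent and identically distributed (i.i.d.),
zero-mean complex Gaussian vectors such that $\mathbf{R}_n(u) = \eta_u \mathbf{I}$, where
$\eta_u$ is the noise spectral level at frequency $\omega_u$. Define

$$\begin{aligned}
\mathbf{r}(k,u) &\triangleq [\mathbf{r}_1^T(k,u), \ldots, \mathbf{r}_L^T(k,u)]^T \\
\mathbf{n}(k,u) &\triangleq [\mathbf{n}_1^T(k,u), \ldots, \mathbf{n}_L^T(k,u)]^T \\
\mathbf{s}(k,u) &\triangleq [s_1(k,u), \ldots, s_Q(k,u)]^T
\end{aligned} \tag{10.5}$$

and

$$\begin{aligned}
\mathbf{A}(u) &\triangleq [\mathbf{A}_1(u), \ldots, \mathbf{A}_Q(u)] \\
\mathbf{A}_q(u) &\triangleq \mathrm{Diag}\{\mathbf{a}_{q,1}(u), \ldots, \mathbf{a}_{q,L}(u)\} \\
\mathbf{B} &\triangleq \mathrm{Diag}\{\mathbf{b}_1, \ldots, \mathbf{b}_Q\} \\
\mathbf{b}_q &\triangleq [b_{q,1}, b_{q,2}, \ldots, b_{q,L}]^T
\end{aligned} \tag{10.6}$$

where Diag stands for block diagonal. We can now rewrite Equation (10.3) as

$$\mathbf{r}(k,u) = \mathbf{A}(u)\mathbf{B}\,\mathbf{s}(k,u) + \mathbf{n}(k,u) \tag{10.7}$$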


We are mainly interested in estimating the positions, and therefore all the other 
unknown parameters are nuisances. 

The transmitted signals may be considered deterministic or random with
known distribution (usually zero-mean complex Gaussian). Obviously, the
probability distribution function (p.d.f.) of the observations in Equation (10.7) is
affected by the assumption on the signals. However, in both cases the log
likelihood function (LLF) of the collection of all observations in Equation (10.7) is

$$\mathcal{L}(\boldsymbol{\mu}) = \sum_{k=1}^{K} \sum_{u=1}^{J} \mathcal{L}(\mathbf{r}(k,u) \mid \boldsymbol{\mu}) \tag{10.8}$$

where $\boldsymbol{\mu}$ is the parameter vector determined according to the assumption on
the signals, and $\mathcal{L}(\mathbf{r}(k,u) \mid \boldsymbol{\mu})$ is the LLF of the
observation vector $\mathbf{r}(k,u)$ parameterized by $\boldsymbol{\mu}$.

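In the case of deterministic signals, the LLF of $\mathbf{r}(k,u)$ in Equation (10.7) is

$$\mathcal{L}(\mathbf{r}(k,u) \mid \boldsymbol{\mu}) = -\frac{1}{\eta_u} \left\| \mathbf{r}(k,u) - \mathbf{A}(u)\mathbf{B}\,\mathbf{s}(k,u) \right\|^2 - ML \ln(\pi \eta_u) \tag{10.9}$$

where the parameter vector is defined as

$$\boldsymbol{\mu} \triangleq [\mathbf{p}^T, \mathbf{b}^T, \mathbf{s}^T, \boldsymbol{\eta}^T]^T \tag{10.10}$$

with

$$\begin{aligned}
\mathbf{p} &\triangleq [\mathbf{p}_1^T, \ldots, \mathbf{p}_Q^T]^T \\
\mathbf{b} &\triangleq [\mathbf{b}_1^T, \ldots, \mathbf{b}_Q^T]^T \\
\mathbf{s} &\triangleq [\mathbf{s}^T(1,1), \ldots, \mathbf{s}^T(K,J)]^T \\
\boldsymbol{\eta} &\triangleq [\eta_1, \ldots, \eta_J]^T
\end{aligned} \tag{10.11}$$

If the signal waveforms are known, the vector $\mathbf{s}$ is omitted from $\boldsymbol{\mu}$.
Note that, if the signals are unknown, the size of $\boldsymbol{\mu}$ increases with the data,
which may result in inefficient estimation.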

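If the signals are assumed to be random zero-mean Gaussian processes, then
the LLF of $\mathbf{r}(k,u)$ in Equation (10.7) is

$$\mathcal{L}(\mathbf{r}(k,u) \mid \boldsymbol{\mu}) = -\ln |\mathbf{R}_r(u)| - \mathrm{tr}\{\hat{\mathbf{R}}_r(u)\, \mathbf{R}_r^{-1}(u)\} - ML \ln(\pi) \tag{10.12}$$

where

$$\mathbf{R}_r(u) \triangleq E\{\mathbf{r}(k,u)\mathbf{r}^H(k,u)\} = \mathbf{A}(u)\mathbf{B}\mathbf{R}_s(u)\mathbf{B}^H\mathbf{A}^H(u) + \eta_u \mathbf{I} \tag{10.13}$$

$$\hat{\mathbf{R}}_r(u) \triangleq \frac{1}{K} \sum_{k=1}^{K} \mathbf{r}(k,u)\mathbf{r}^H(k,u) \tag{10.14}$$

$$\mathbf{R}_s(u) \triangleq E\{\mathbf{s}(k,u)\mathbf{s}^H(k,u)\} \tag{10.15}$$

and the parameter vector is defined as

$$\boldsymbol{\mu} \triangleq [\mathbf{p}^T, \mathbf{b}^T, \boldsymbol{\theta}^T, \boldsymbol{\eta}^T]^T \tag{10.16}$$

where the vector $\boldsymbol{\theta}$ represents the independent parameters of $\mathbf{R}_s(u)$
for all values of $u$.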


10.2.2 Uniqueness Conditions 

In this section we focus on stochastic signals. We determine conditions that
ensure unique estimation of the emitters' parameters. The conditions identify
the maximum number of uniquely resolvable emitters, known in the literature
as resolution capacity (RC). We consider two cases: no prior knowledge of the
correlation between signals, and uncorrelated signals. We derive two types of
conditions for uniqueness. The first is a necessary condition based on the
requirement that the number of unknown parameters be equal to or smaller than the
number of independent equations. The second guarantees that the observation
covariance matrix is unique for almost any parameter vector.

Necessary Condition for Uniqueness 

A necessary condition for unique estimation stems from the requirement that the
number of unknown real parameters not be larger than the number of independent
equations. In mathematical notation the necessary condition is given as

$$\dim(\mathbf{p}) + \dim(\mathbf{b}) + \dim(\{\mathbf{R}_s(u)\}) + \dim(\boldsymbol{\eta}) \le \dim(\{\mathbf{R}_r(u)\}) \tag{10.17}$$

where $\dim(x)$ is the number of real variables needed to describe $x$.

Consider first the dimensionality of the unknowns. Assuming planar geometry, it
is clear that $\dim(\mathbf{p}) = 2Q$. Note that only the product of the signal vector and
the attenuation coefficient is observed. Thus, a unique estimation of the signal
vector and the attenuation coefficient requires some constraints on the attenuation
vector. Without loss of generality, we assume that $b_{q,1} = 1$, and therefore for $Q$
sources $\dim(\mathbf{b}) = 2Q(L-1)$. Also, since $\mathbf{R}_n(u) = \eta_u \mathbf{I}$, it is
clear that $\dim(\boldsymbol{\eta}) = J$. It can further be shown that

$$\dim(\{\mathbf{R}_s(u)\}) = \begin{cases} JQ^2, & \text{General case} \\ JQ, & \text{Uncorrelated signals} \end{cases} \tag{10.18}$$


Consider now the dimensionality of the right side of Equation (10.17). Define a
baseline vector as the vector from one sensor to another. Denote by $Y$ the number of
distinct baseline vectors in an array; a baseline of length 0 is also counted. The
minimal number of baseline vectors is associated with a uniform linear array (ULA).
The maximal number of baseline vectors is associated with a random array. Given
that the number of sensors in the array is $M$, $Y$ is related to the array geometry
and is restricted to the range

$$M \le Y \le \frac{M(M-1)}{2} + 1 \tag{10.19}$$
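The two limits of Equation (10.19) can be checked by direct counting. The sketch below is illustrative only: it assumes that a baseline and its negative are counted once (the convention that reproduces the limits above), and the function name `count_baselines` and the irregular sensor layout are hypothetical.

```python
from itertools import combinations

def count_baselines(sensors, ndigits=9):
    """Number of distinct baseline vectors, counting v and -v once and
    including the zero-length baseline (an assumed convention)."""
    seen = {(0.0, 0.0)}  # a baseline of length 0 is also counted
    for (x1, y1), (x2, y2) in combinations(sensors, 2):
        dx, dy = round(x2 - x1, ndigits), round(y2 - y1, ndigits)
        if dx < 0 or (dx == 0 and dy < 0):  # fix the sign so v and -v coincide
            dx, dy = -dx, -dy
        seen.add((dx, dy))
    return len(seen)

M = 4
ula = [(0.5 * m, 0.0) for m in range(M)]                       # uniform linear array
irregular = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 5.0)]   # hypothetical layout

Y_ula = count_baselines(ula)
Y_irregular = count_baselines(irregular)
print("ULA:", Y_ula, "lower limit M =", M)
print("irregular:", Y_irregular, "upper limit =", M * (M - 1) // 2 + 1)
```

For the ULA the repeated spacings collapse onto $M$ distinct baselines, whereas a layout with no repeated spacings attains the upper limit.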


It is shown in Amar and Weiss (2007) that given $\mathbf{R}_r(u)$ with a rank $r =
\min\{W, LM\}$ for all $u$, then

$$\dim(\{\mathbf{R}_r(u)\}) = \begin{cases} J(2LMr - r^2), & \text{General case} \\ \min\{J(2LMr - r^2),\, J\varphi(Y)\}, & \text{Uncorrelated signals} \end{cases} \tag{10.20}$$

where

$$\varphi(Y) \triangleq L\left(2Y - 1 + (L-1)M^2\right) \tag{10.21}$$

Substituting the previous results in Equation (10.17) yields in the general case

$$Q \le \sqrt{2LMr - r^2 - 1 + \left(\frac{L}{J}\right)^2} - \frac{L}{J} \tag{10.22}$$

The minimal and maximal values of the right side of Equation (10.22) are obtained
for $r = 1$ and $r = LM$, respectively. Assuming $J \gg L$,

$$\begin{aligned}
Q &< \sqrt{2LM}, & r &= 1 \\
Q &< LM, & r &= LM
\end{aligned} \tag{10.23}$$

Thus, the maximum number of uniquely resolvable emitters is $LM - 1$, where
$LM$ is the total number of antenna elements in the system. Further, assuming
$r = LM$ and narrowband signals ($J = 1$), we get from Equation (10.22) that
$Q \le L(M-1)$, that is, $L$ times the corresponding AOA bound (Wax and Ziskind,
1989). As expected, more wideband signals than narrowband signals can be
resolved.

For uncorrelated signals we obtain, by substituting Equation (10.20) in
Equation (10.17),

$$Q \le \frac{J}{J + 2L} \left( \min\{2LMr - r^2,\, \varphi(Y)\} - 1 \right) \tag{10.24}$$

It is shown in Amar and Weiss (2007) that the lowest upper bound is obtained for
a ULA for which $Y = M$, and the highest upper bound is obtained for an arbitrary
array for which $Y = \frac{M(M-1)}{2} + 1$. That is,

$$\begin{aligned}
Q &\le \frac{J}{J + 2L}\left((LM)^2 - L(M-1)^2 - 1\right), & &\text{ULA } (Y = M) \\
Q &\le \frac{J}{J + 2L}\left((LM)^2 - L(M-1) - 1\right), & &\text{Random array } \left(Y = \tfrac{M(M-1)}{2} + 1\right)
\end{aligned} \tag{10.25}$$

Thus, a ULA is not recommended for achieving large RC. Furthermore, for ULA
configuration it is shown in Amar and Weiss (2007) that, for narrowband ($J = 1$)
and wideband ($J \gg 1$) signals, the RC is

$$\begin{aligned}
Q &\le M + M^2(L-1)/2, & &\text{Narrowband signals} \\
Q &\le L(2M - 1) + M^2 L(L-1) - 1, & &\text{Wideband signals}
\end{aligned} \tag{10.26}$$

For narrowband signals this result is exactly $M^2(L-1)/2$ above the corresponding
AOA result; for wideband signals it should be compared with $2M - 1$, the
corresponding AOA result. The difference is significant.
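As a rough numerical illustration, the parameter-counting inequality of Equation (10.17) can be evaluated directly with the dimensions given in Equations (10.18), (10.20), and (10.21), taking $r = LM$. This is only a sketch: the function names `phi`, `rc_general`, and `rc_uncorrelated` are hypothetical, and the integer bound is obtained by rounding the real-valued bound down.

```python
import math

def phi(Y, L, M):
    # Eq. (10.21): number of real constraints per frequency bin, uncorrelated signals
    return L * (2 * Y - 1 + (L - 1) * M ** 2)

def rc_general(L, M, J, r=None):
    # Largest integer Q with 2Q + 2Q(L-1) + J*Q^2 + J <= J*(2*L*M*r - r^2)
    r = L * M if r is None else r
    bound = math.sqrt(2 * L * M * r - r ** 2 - 1 + (L / J) ** 2) - L / J
    return math.floor(bound + 1e-9)  # small guard against rounding error

def rc_uncorrelated(L, M, J, Y):
    # Largest integer Q with 2Q + 2Q(L-1) + J*Q + J <= J*min{(LM)^2, phi(Y)}
    bound = J / (J + 2 * L) * (min((L * M) ** 2, phi(Y, L, M)) - 1)
    return math.floor(bound + 1e-9)

# General case, two stations with three sensors each
print(rc_general(L=2, M=3, J=1))      # narrowband
print(rc_general(L=2, M=3, J=100))    # many frequency bins

# Uncorrelated signals, three stations with three sensors each
print(rc_uncorrelated(L=3, M=3, J=1, Y=3))   # ULA, Y = M
print(rc_uncorrelated(L=3, M=3, J=1, Y=4))   # Y = M(M-1)/2 + 1
```

With two three-element stations the general-case count gives $L(M-1) = 4$ emitters for $J = 1$ and approaches $LM - 1 = 5$ as $J$ grows, in line with the discussion following Equation (10.23).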


Uniqueness with Probability One 

We derive expressions for the RC that guarantee a unique solution for every
parameter vector, except for a set of parameter vectors with measure zero. This
condition ensures that, except for a set of vectors with measure zero, for any
parameter vector $\boldsymbol{\mu}$ that is associated with the true covariance matrix, there
is no other parameter vector $\boldsymbol{\mu}'$ that corresponds to an identical covariance
matrix.

Formally define the “legitimate set” and the “ambiguity set” as all Hermitian
matrices satisfying

$$\mathcal{F} \triangleq \{\mathbf{R}(u) : \mathbf{R}(u) = \mathbf{R}_r(u; \boldsymbol{\mu}),\ u = 1, \ldots, J\} \tag{10.27}$$

$$\mathcal{U} \triangleq \{\mathbf{R}(u) : \mathbf{R}(u) = \mathbf{R}_r(u; \boldsymbol{\mu}) = \mathbf{R}_r(u; \boldsymbol{\mu}');\ \boldsymbol{\mu} \ne \boldsymbol{\mu}',\ u = 1, \ldots, J\} \tag{10.28}$$


For this condition to hold, the dimensionality of the ambiguity set must be strictly
smaller than the dimensionality of the legitimate set. Using the results presented
in the previous subsection, it is clear that

$$\dim(\mathcal{F}) = \begin{cases} 2Q + 2Q(L-1) + JQ^2 + J, & \text{General case} \\ 2Q + 2Q(L-1) + JQ + J, & \text{Uncorrelated signals} \end{cases} \tag{10.29}$$

To determine the dimensionality of $\mathcal{U}$, we first define

$$\mathbf{H}(u; \boldsymbol{\mu}, \boldsymbol{\mu}') \triangleq \mathbf{R}_r(u; \boldsymbol{\mu}) - \mathbf{R}_r(u; \boldsymbol{\mu}') \tag{10.30}$$


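The set can thus be defined by

$$\mathcal{U} = \{\mathbf{H}(u; \boldsymbol{\mu}, \boldsymbol{\mu}') : \mathbf{H}(u; \boldsymbol{\mu}, \boldsymbol{\mu}') = \mathbf{0};\ \boldsymbol{\mu} \ne \boldsymbol{\mu}',\ u = 1, \ldots, J\} \tag{10.31}$$

The dimension of set $\mathcal{U}$ is determined by the dimension of
$\{\mathbf{H}(u; \boldsymbol{\mu}, \boldsymbol{\mu}')\}$ minus the number of constraints associated
with the equation $\{\mathbf{H}(u; \boldsymbol{\mu}, \boldsymbol{\mu}') = \mathbf{0};\ \boldsymbol{\mu} \ne \boldsymbol{\mu}'\}$.
The dimension of $\{\mathbf{H}(u; \boldsymbol{\mu}, \boldsymbol{\mu}')\}$ is

$$\dim(\{\mathbf{H}(u; \boldsymbol{\mu}, \boldsymbol{\mu}')\}) = \begin{cases} 2\left(2Q + 2Q(L-1) + JQ^2 + J\right), & \text{General case} \\ 2\left(2Q + 2Q(L-1) + JQ + J\right), & \text{Uncorrelated signals} \end{cases} \tag{10.32}$$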

The number of constraints imposed by $\{\mathbf{H}(u; \boldsymbol{\mu}, \boldsymbol{\mu}') = \mathbf{0}\}$
is $J(ML)^2$ in the general case. For uncorrelated signals there is a functional
dependency between the entries of $\mathbf{R}_r(u; \boldsymbol{\mu})$, and in this case the
number of constraints is $J\varphi(Y)$. Thus,

$$\dim(\mathcal{U}) = \begin{cases} 2\left(2Q + 2Q(L-1) + JQ^2 + J\right) - J(LM)^2, & \text{General case} \\ 2\left(2Q + 2Q(L-1) + JQ + J\right) - J\varphi(Y), & \text{Uncorrelated signals} \end{cases} \tag{10.33}$$

The uniqueness condition holds if Equation (10.33) is smaller than Equation
(10.29). That is,

$$\begin{aligned}
Q &< \sqrt{(LM)^2 + (L/J)^2 - 1} - \frac{L}{J}, & &\text{General case} \\
Q &< \frac{J}{J + 2L}\left(\varphi(Y) - 1\right), & &\text{Uncorrelated signals}
\end{aligned} \tag{10.34}$$

These bounds are similar to the necessary bounds without the equalities.


Numerical Examples 

We check the upper bounds by examining the singularity of the Fisher information
matrix (FIM). Normally, when the FIM is singular it is impossible to get a
reasonable estimate. Thus, we expect the FIM to become singular if the number
of sources is too large for obtaining reasonable location estimates. We should
emphasize here that we know of certain estimation problems in which the FIM
becomes singular and it is still possible to get well-behaved estimates.

FIGURE 10.4 Largest number of resolvable emitters in the general case: (a) L = 2 and (b) L = 3.


In Figure 10.4 we plot the upper bound on the RC as given in Equation
(10.22) for the general case, and the largest number of emitters for which the
FIM (Weiss and Friedlander, 1993) does not become singular, versus the number
of frequency bins. We examined the cases of $L = 2$ with $M = 2, 3, 4$, and $L = 3$
with $M = 2, 3, 4$. In each case the station positions were randomly chosen and
the arrays were ULAs. As can be seen, the analytical upper bound on the RC
agrees with the FIM singularity for any given value of $M$, $L$, $J$.

We compare the analytic upper bounds for $Y = M$ (corresponding to a ULA)
and the upper bound for $Y = M(M-1)/2 + 1$ (corresponding to an arbitrary
array) with the FIM singularity condition (Weiss and Amar, 2005) assuming
uncorrelated signals. We use three stations, each equipped with an array of three
elements. As can be seen in Figure 10.5, the upper bound related to an arbitrary
array is higher than the bound related to a ULA. For $J = 1$ (narrowband signals)
we see that the upper bound for an arbitrary array is larger than the upper bound
for a ULA by a single emitter. However, for $J \gg 1$ (wideband signals) the gap
between the two upper bounds is larger when the number of frequency bins is large.


10.2.3 DPD for Unknown Transmitted Waveforms 

We discuss potential location algorithms for the common case where the signal
waveforms are unknown to the receivers. We start by assuming that the
signals are random, and we apply a multiple signal classification (MUSIC)
approach.

FIGURE 10.5 Comparison of the upper bounds on the number of resolvable emitters for
uncorrelated signals with $Y = M$ and $Y = M(M-1)/2 + 1$. (Horizontal axis: number of
frequency bins, $J$.)


MUSIC Approach 

To avoid a multidimensional search for the multisource case, we follow the steps
leading to the MUSIC algorithm (Schmidt, 1986). We assume that the noise
spectral level is identical at all frequencies (flat). Recall the definition of the
covariance matrix of the intercepted signals in Equation (10.13). The $q$th column
vector of $\mathbf{A}(u)\mathbf{B}$ is $\mathbf{A}_q(u)\mathbf{b}_q$. The column vectors of
$\mathbf{A}(u)\mathbf{B}$ are orthogonal to the noise subspace and are contained in the signal
subspace. Define the matrix $\mathbf{A}_p(u)$, which has the same structure as
$\mathbf{A}_q(u)$ for a source located at some point $\mathbf{p}$. We also use $\mathbf{b}$ as a
general vector having the structure of $\mathbf{b}_q$.

Let $\hat{\mathbf{U}}_s(u)$ be the $ML \times Q$ matrix consisting of the eigenvectors of
$\hat{\mathbf{R}}_r(u)$ corresponding to the $Q$ largest eigenvalues. Following the MUSIC
algorithm we propose the cost function (Weiss and Amar, 2005)

$$F(\mathbf{p}, \mathbf{b}) = \sum_{u=1}^{J} \left(\mathbf{A}_p(u)\mathbf{b}\right)^H \hat{\mathbf{U}}_s(u)\hat{\mathbf{U}}_s^H(u)\, \mathbf{A}_p(u)\mathbf{b} = \mathbf{b}^H \left[ \sum_{u=1}^{J} \mathbf{A}_p^H(u)\, \hat{\mathbf{U}}_s(u)\hat{\mathbf{U}}_s^H(u)\, \mathbf{A}_p(u) \right] \mathbf{b} \tag{10.35}$$

Note that, to facilitate a unique solution, we assume that the norm of $\mathbf{b}$ is one.

In the sequel we use the Rayleigh-Ritz theorem for maximization of a
quadratic form (Rao, 2002), which we display next for easy reference.


Theorem 10.1. Given an $N \times N$ Hermitian matrix $\mathbf{A}$ and an $N \times 1$ vector
$\mathbf{x}$, define the Rayleigh quotient
$R(\mathbf{x}) = \dfrac{\mathbf{x}^H \mathbf{A} \mathbf{x}}{\mathbf{x}^H \mathbf{x}}$.
The vector $\mathbf{x}$ that maximizes $R(\mathbf{x})$ is the eigenvector of $\mathbf{A}$ that
corresponds to the largest eigenvalue of A. The maximum of $R(\mathbf{x})$ is then $\lambda_{\max}\{\mathbf{A}\}$.
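Theorem 10.1 is easy to check numerically. The following sketch assumes NumPy is available and uses an arbitrary random Hermitian matrix; the helper name `rayleigh` is introduced here only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5
G = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
A = G @ G.conj().T                       # a random Hermitian matrix

def rayleigh(A, x):
    # Rayleigh quotient x^H A x / x^H x (real for Hermitian A)
    return np.real(x.conj() @ A @ x) / np.real(x.conj() @ x)

eigvals, eigvecs = np.linalg.eigh(A)     # eigenvalues in ascending order
lam_max, x_max = eigvals[-1], eigvecs[:, -1]

# The quotient of random trial vectors never exceeds the largest eigenvalue
trials = [rayleigh(A, rng.standard_normal(N) + 1j * rng.standard_normal(N))
          for _ in range(1000)]

print("lambda_max           :", lam_max)
print("R(principal eigvec)  :", rayleigh(A, x_max))
print("largest random trial :", max(trials))
```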

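Using Theorem 10.1 we obtain that, for any assumed position $\mathbf{p}$, the maximum
of $F(\mathbf{p}, \mathbf{b})$ corresponds to the maximal eigenvalue of the $L \times L$ matrix
$\mathbf{D}(\mathbf{p})$ defined by

$$\mathbf{D}(\mathbf{p}) \triangleq \sum_{u=1}^{J} \mathbf{A}_p^H(u)\, \hat{\mathbf{U}}_s(u)\hat{\mathbf{U}}_s^H(u)\, \mathbf{A}_p(u) \tag{10.36}$$

Therefore, Equation (10.35) reduces to

$$F(\mathbf{p}) = \lambda_{\max}\{\mathbf{D}(\mathbf{p})\} \tag{10.37}$$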


The matrix $\mathbf{D}(\mathbf{p})$ is a function of the observed data (i.e.,
$\hat{\mathbf{U}}_s(u)$) and the array response at each base station to an emitter located at
$\mathbf{p}$. It is clear that the maximization of Equation (10.37) requires only a 2D search
for emitters confined to a plane or a three-dimensional (3D) search in a general
setting. This is a DPD approach. It is interesting to note that the dimensions of the
matrix $\mathbf{D}(\mathbf{p})$ are
LxL and are usually rather small.
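The grid search implied by Equations (10.36) and (10.37) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes NumPy, simulates the frequency-domain model of Equation (10.7) directly, and uses a hypothetical array response (a half-wavelength ULA laid along the y-axis at every station). The geometry, bin spacing, noise level, and all function names are assumptions chosen for the example.

```python
import numpy as np

C = 3e8  # propagation speed in m/s

def steering(p, station, M):
    # Hypothetical response of an M-element half-wavelength ULA along the y-axis
    d = p - station
    return np.exp(1j * np.pi * np.arange(M) * d[1] / np.linalg.norm(d))

def A_p(p, stations, M, omega):
    # ML x L block-diagonal matrix with the structure of A_q(u), Eqs. (10.4) and (10.6)
    L = len(stations)
    A = np.zeros((M * L, L), dtype=complex)
    for l, st in enumerate(stations):
        tau = np.linalg.norm(p - st) / C
        A[l * M:(l + 1) * M, l] = steering(p, st, M) * np.exp(-1j * omega * tau)
    return A

def dpd_music_cost(p, stations, M, omegas, Us):
    # F(p) = lambda_max{D(p)}, Eqs. (10.36)-(10.37)
    L = len(stations)
    D = np.zeros((L, L), dtype=complex)
    for omega, U in zip(omegas, Us):
        G = U.conj().T @ A_p(p, stations, M, omega)    # Q x L
        D += G.conj().T @ G
    return np.linalg.eigvalsh(D)[-1]

rng = np.random.default_rng(1)
stations = np.array([[2000., -2000.], [2000., 0.], [2000., 2000.]])
emitters = np.array([[0., 1500.], [0., -1500.]])
M, K, J, Q = 3, 200, 4, 2
omegas = 2 * np.pi * 50e3 * np.arange(1, J + 1)        # assumed bin frequencies
b = [np.array([1.0, 0.8, 0.4]), np.array([0.5, 0.9, 1.0])]

Us = []
for omega in omegas:
    AB = np.column_stack([A_p(p, stations, M, omega) @ bq for p, bq in zip(emitters, b)])
    s = (rng.standard_normal((Q, K)) + 1j * rng.standard_normal((Q, K))) / np.sqrt(2)
    n = 0.03 * (rng.standard_normal((M * 3, K)) + 1j * rng.standard_normal((M * 3, K)))
    r = AB @ s + n                                     # Eq. (10.7)
    R_hat = r @ r.conj().T / K                         # Eq. (10.14)
    Us.append(np.linalg.eigh(R_hat)[1][:, -Q:])        # Q principal eigenvectors

xs = np.arange(-1000., 1001., 100.)
ys = np.arange(-2500., 2501., 100.)
cost = np.array([[dpd_music_cost(np.array([x, y]), stations, M, omegas, Us)
                  for x in xs] for y in ys])
iy, ix = np.unravel_index(np.argmax(cost), cost.shape)
p_hat = np.array([xs[ix], ys[iy]])
print("highest peak of the cost function at", p_hat)
```

Only the highest peak is extracted here; locating all $Q$ emitters would require picking the $Q$ largest local maxima of `cost`. Since the projection onto the signal subspace cannot increase the norm, the cost is bounded by $JM$ when each array response has squared norm $M$, and it approaches this value at the emitter positions.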


Numerical Examples 

We compared the performance of the DPD method with the traditional AOA
approach by Monte Carlo (MC) simulations. The performance evaluation was
based on the position root mean-square error (RMSE) defined as

$$\mathrm{RMSE} = \sqrt{\frac{1}{N_{\exp}} \sum_{i=1}^{N_{\exp}} \left\| \hat{\mathbf{p}}_0(i) - \mathbf{p}_0 \right\|^2} \tag{10.38}$$

where $N_{\exp}$ is the number of MC trials, and $\hat{\mathbf{p}}_0(i)$ is the estimated emitter
position at the $i$th trial. We performed 100 MC trials in order to obtain the statistical
properties of the performance.
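For reference, Equation (10.38) amounts to the following computation. The function name `position_rmse` and the two sample estimates are hypothetical.

```python
import math

def position_rmse(estimates, true_position):
    # Eq. (10.38): root of the mean squared position error over the MC trials
    x0, y0 = true_position
    squared_errors = [(x - x0) ** 2 + (y - y0) ** 2 for x, y in estimates]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Two hypothetical trial outcomes, each 5 m away from the true position
rmse = position_rmse([(3.0, 4.0), (-3.0, -4.0)], (0.0, 0.0))
print(rmse)
```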

Consider three base stations placed at coordinates (2, —2), (2,0), (2,2) (in km) 
and two emitters placed at (0, +1.5) and (0, —1.5) (in km). Each transmitted 
waveform is a carrier amplitude modulated by a narrowband random Gaussian 
waveform. The signal is unknown to the receivers. Each base station is equipped 
with a ULA of only three antenna elements. Each location determination is based 
on 200 snapshots, each of 4.5 msec at a single frequency ($K = 200$, $J = 1$). The
snapshot length ensures that the errors introduced by the finite-length fast Fourier
transform (FFT) are 30 dB below the signal level. The SNR (at the base station
receiving the strongest signal) is varied between 3 and 23 dB. The attenuation
vector is selected as $\mathbf{b} = [1, 0.8, 0.4]^T$.

FIGURE 10.6 RMSE of DPD and traditional AOA, CRLB, and PA results for three base stations,
two sources, and unknown waveforms. (Horizontal axis: SNR in dB.)

The results for each of the sources are shown in Figure 10.6. Also plotted is
the theoretical performance analysis (PA) obtained by small error analysis of the
DPD method. (For details see Weiss and Amar, 2005.) Notice that the theoretical
results almost coincide with the CRLB. It is again clear that DPD outperforms
AOA at low SNR, whereas the two methods are equivalent at high SNR. Consider
now that three transmitters are placed at (0, 1.5), (0, −1.5), (−1, 0) (in km).
Each base station collects 1000 snapshots, and attenuation is equal at all base
stations. Since each base station is equipped with an array of only three elements,
traditional AOA based on MUSIC fails. However, DPD works well, as shown in
Figure 10.7.


Iterative Maximum Likelihood Approach 

We now propose a solution for deterministic signals. The maximum likelihood
estimation (MLE) of the parameter vector given by the observation model in
Equation (10.9), assuming a known and flat noise spectral level, is

$$\hat{\boldsymbol{\mu}} = \arg\min_{\boldsymbol{\mu}} \left\{ \sum_{k=1}^{K} \sum_{u=1}^{J} \left\| \mathbf{r}(k,u) - \mathbf{A}(u)\mathbf{B}\,\mathbf{s}(k,u) \right\|^2 \right\} \tag{10.39}$$

FIGURE 10.7 RMSE of DPD and traditional AOA, CRLB, and PA results for three base
stations, three sources, and unknown waveforms. (Horizontal axis: SNR in dB.)


To reduce the multidimensional search of the MLE, we resort to an iterative
least-squares (LS) algorithm by modifying the technique presented in Weiss
et al. (1988) to the geolocation model. The proposed Decoupled Geolocation
ML (DGML) algorithm estimates the positions of $Q$ emitters in a plane by a
2D search in each iteration instead of by a multidimensional search over all
positions.

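Assume we have an initial rough estimate of the vectors
$\{\hat{\mathbf{p}}_q, \hat{\mathbf{b}}_q\}_{q=1}^{Q}$. As
in many iterative algorithms, the convergence to the global minimum of the
cost function depends on the initial estimate. Such an initial estimate can be
provided by the DPD method previously presented. Let $\hat{\mathbf{A}}(u)$, $\hat{\mathbf{B}}$ be
the matrices $\mathbf{A}(u)$, $\mathbf{B}$ computed with the initial estimate of
$\{\hat{\mathbf{p}}_q, \hat{\mathbf{b}}_q\}_{q=1}^{Q}$. Now we minimize
Equation (10.39) w.r.t. $\mathbf{s}(k,u)$:

$$\hat{\mathbf{s}}(k,u) = \left[ (\hat{\mathbf{A}}(u)\hat{\mathbf{B}})^H \hat{\mathbf{A}}(u)\hat{\mathbf{B}} \right]^{-1} (\hat{\mathbf{A}}(u)\hat{\mathbf{B}})^H\, \mathbf{r}(k,u) \tag{10.40}$$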

Define the $Q \times Q$ diagonal matrix $\mathbf{E}_q$, where the entries on the main diagonal
are all 1 except for the $(q,q)$th entry, which is 0; all off-diagonal entries equal 0.
Define the vector

$$\hat{\mathbf{s}}^{(q)}(k,u) \triangleq \mathbf{E}_q\, \hat{\mathbf{s}}(k,u) \tag{10.41}$$

Also define a vector associated with the $q$th emitter,

$$\mathbf{w}_q(k,u) \triangleq \mathbf{r}(k,u) - \hat{\mathbf{A}}(u)\hat{\mathbf{B}}\, \hat{\mathbf{s}}^{(q)}(k,u) \tag{10.42}$$

Assume that the positions and attenuations of all emitters, except those of the
$q$th emitter, are equal to their true values. In such a case, using the definitions
in Equation (10.6), Equation (10.42) reduces to

$$\mathbf{w}_q(k,u) = \mathbf{A}_q(u)\mathbf{b}_q\, s_q(k,u) + \mathbf{n}(k,u) \tag{10.43}$$

This is the $q$th column vector of $\mathbf{A}(u)\mathbf{B}$ multiplied by the $q$th signal
$s_q(k,u)$, plus the noise. This motivates a decoupled search. Instead of minimizing
Equation (10.39) we minimize

$$F = \sum_{k=1}^{K} \sum_{u=1}^{J} \left\| \mathbf{w}_q(k,u) - \mathbf{A}_q(u)\mathbf{b}_q\, s_q(k,u) \right\|^2 \tag{10.44}$$

where $s_q(k,u)$ is the $q$th element of $\mathbf{s}(k,u)$. Minimizing Equation (10.44) w.r.t.
$s_q(k,u)$ yields

$$\hat{s}_q(k,u) = \left[ (\mathbf{A}_q(u)\mathbf{b}_q)^H \mathbf{A}_q(u)\mathbf{b}_q \right]^{-1} (\mathbf{A}_q(u)\mathbf{b}_q)^H\, \mathbf{w}_q(k,u) \tag{10.45}$$

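By substituting Equation (10.45) in Equation (10.44) we see that the estimate
of $\mathbf{p}_q$ and $\mathbf{b}_q$ is given by

$$(\hat{\mathbf{p}}_q, \hat{\mathbf{b}}_q) = \arg\max_{\mathbf{p}_q, \mathbf{b}_q} \left\{ \|\mathbf{b}_q\|^{-2} \sum_{k=1}^{K} \sum_{u=1}^{J} \left| \mathbf{b}_q^H \mathbf{A}_q^H(u)\, \mathbf{w}_q(k,u) \right|^2 \right\} \tag{10.46}$$

Recall that we assume that $\|\mathbf{b}_q\|^2 = 1$. Define the sample correlation matrix
associated with the $q$th signal and the $u$th Fourier coefficient:

$$\hat{\mathbf{W}}_q(u) \triangleq \frac{1}{K} \sum_{k=1}^{K} \mathbf{w}_q(k,u)\, \mathbf{w}_q^H(k,u) \tag{10.47}$$

and the $L \times L$ Hermitian matrix:

$$\mathbf{D}_q \triangleq \sum_{u=1}^{J} \mathbf{A}_q^H(u)\, \hat{\mathbf{W}}_q(u)\, \mathbf{A}_q(u) \tag{10.48}$$

Substituting Equation (10.48) in Equation (10.46) yields

$$(\hat{\mathbf{p}}_q, \hat{\mathbf{b}}_q) = \arg\max_{\mathbf{p}_q, \mathbf{b}_q} \left\{ \mathbf{b}_q^H \mathbf{D}_q \mathbf{b}_q \right\} \tag{10.49}$$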

Using Theorem 10.1 we obtain that, for any given emitter position, the vector
$\mathbf{b}_q$ that maximizes the expression is equal to the eigenvector $\mathbf{u}_{\max}$
corresponding to the largest eigenvalue of $\mathbf{D}_q$, that is,
$\hat{\mathbf{b}}_q = \mathbf{u}_{\max}(\mathbf{D}_q)$, and the estimated position is

$$\hat{\mathbf{p}}_q = \arg\max_{\mathbf{p}_q} \lambda_{\max}\{\mathbf{D}_q\} \tag{10.50}$$

Therefore, the maximum of the cost function is found by performing a 2D grid
search over the area of interest. For each grid point the largest eigenvalue of
$\mathbf{D}_q$ is the local value of the cost function. The grid point associated with the largest
value of the cost function is the temporary estimate of the qth source position.
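One decoupled update of Equations (10.40) to (10.50) can be sketched as below for the narrowband case ($J = 1$). This is an illustrative sketch under several assumptions, not the authors' code: NumPy is used, the array response is a hypothetical three-element half-wavelength ULA, the delay phase of Equation (10.4) is absorbed into the attenuation vector, and the names `A_of`, `ls_cost`, and `dgml_update` are invented for the example. Whether the iteration reaches the true positions depends on the initial estimate, as noted above.

```python
import numpy as np

def A_of(p, stations, axes, M):
    # ML x L block-diagonal response matrix for a source at p (narrowband, J = 1)
    L = len(stations)
    A = np.zeros((M * L, L), dtype=complex)
    for l in range(L):
        d = p - stations[l]
        sin_theta = d @ axes[l] / np.linalg.norm(d)
        A[l * M:(l + 1) * M, l] = np.exp(1j * np.pi * np.arange(M) * sin_theta)
    return A

def ls_cost(r, AB):
    # Eq. (10.39) minimized over the signals, for a single frequency bin
    s_hat = np.linalg.pinv(AB) @ r
    return np.linalg.norm(r - AB @ s_hat) ** 2

def dgml_update(q, r, P, B, grid, stations, axes, M):
    # Decoupled update of the position and attenuation of emitter q
    AB = np.column_stack([A_of(p, stations, axes, M) @ b for p, b in zip(P, B)])
    s_hat = np.linalg.pinv(AB) @ r                        # Eq. (10.40)
    others = [j for j in range(len(P)) if j != q]
    w = r - AB[:, others] @ s_hat[others, :]              # Eq. (10.42)
    W = w @ w.conj().T / r.shape[1]                       # Eq. (10.47)
    best = (-np.inf, None, None)
    for p in grid:
        A = A_of(p, stations, axes, M)
        lam, vec = np.linalg.eigh(A.conj().T @ W @ A)     # D_q, Eq. (10.48)
        if lam[-1] > best[0]:
            best = (lam[-1], p, vec[:, -1])               # Eq. (10.50)
    return best[1], best[2]

rng = np.random.default_rng(2)
stations = np.array([[0., 0.], [1000., 0.], [1000., 1000.], [0., 1000.]])
ang = np.deg2rad([135., 45., 135., 45.])                  # array rotation angles
axes = np.column_stack([np.cos(ang), np.sin(ang)])
M, K = 3, 50
truth = [np.array([300., 400.]), np.array([700., 600.])]
b_true = [rng.standard_normal(4) + 1j * rng.standard_normal(4) for _ in truth]
AB_true = np.column_stack([A_of(p, stations, axes, M) @ b for p, b in zip(truth, b_true)])
s = (rng.standard_normal((2, K)) + 1j * rng.standard_normal((2, K))) / np.sqrt(2)
r = AB_true @ s + 0.1 * (rng.standard_normal((4 * M, K)) + 1j * rng.standard_normal((4 * M, K)))

grid = [np.array([x, y]) for x in np.arange(100., 901., 50.)
        for y in np.arange(100., 901., 50.)]
P = [np.array([350., 350.]), np.array([650., 650.])]      # rough initial positions (on the grid)
B = [np.ones(4, dtype=complex) / 2 for _ in P]            # rough initial attenuations
AB0 = np.column_stack([A_of(p, stations, axes, M) @ b for p, b in zip(P, B)])
costs = [ls_cost(r, AB0)]
for it in range(5):
    for q in range(len(P)):
        P[q], B[q] = dgml_update(q, r, P, B, grid, stations, axes, M)
    AB = np.column_stack([A_of(p, stations, axes, M) @ b for p, b in zip(P, B)])
    costs.append(ls_cost(r, AB))
print("LS cost per iteration:", np.round(costs, 3))
print("position estimates   :", P)
```

Because every update maximizes Equation (10.49) exactly over the grid and over $\mathbf{b}_q$, and the initial positions lie on the grid, the least-squares cost of Equation (10.39) cannot increase from one iteration to the next, which gives a simple stopping criterion.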

Numerical Examples 

We demonstrate the performance of the proposed iterative algorithm via MC
simulations. In all simulations (unless stated otherwise) we used the following
parameters. Four base stations ($L = 4$), located at the corners of a square at
positions $(X_1, Y_1) = [0, 0]$, $(X_2, Y_2) = [1, 0]$, $(X_3, Y_3) = [1, 1]$, and
$(X_4, Y_4) = [0, 1]$, with all units in kilometers. The rotation angles of the arrays,
measured between the array baseline and the west-east axis, denoted by $\theta_\ell$, are
$\theta_1 = 135°$, $\theta_2 = 45°$, $\theta_3 = 135°$, and $\theta_4 = 45°$. The signals and
the noise are random, complex Gaussian vectors with covariance matrices
$\sigma_s^2 \mathbf{I}$ and $\sigma_n^2 \mathbf{I}$, respectively. We assume
the signals are narrowband ($J = 1$). We generated 50 independent Fourier
coefficients ($K = 50$) of the signals and noise processes. Our definition of the SNR
is $\mathrm{SNR[dB]} = 10 \log_{10}(\sigma_s^2 / \sigma_n^2)$. The initial emitter positions
were set randomly as $\hat{\mathbf{p}}_q^{(0)} = \mathbf{p}_q + 100\,\boldsymbol{\xi}_q$,
$q = 1, \ldots, Q$, where $\boldsymbol{\xi}_q$ is a $2 \times 1$ Gaussian vector with
zero mean and covariance equal to the identity matrix. The initial values of the
attenuations were chosen randomly. We performed a coarse search with 10-m
resolution over the full area of interest and a fine search with 1-m resolution near
the peaks found in the coarse search. The convergence threshold of the algorithm
was set to $\varepsilon = 1$ m.

In the following examples we consider several values of SNR. For each SNR
we performed 50 independent trials. In each the initial positions and initial
attenuations were randomly chosen. We then estimated the positions and attenuations
of the emitters in each iteration step (the number of iteration steps in each trial
was not necessarily equal). We evaluated the performance using the following
RMSE for each iteration step:

$$\mathrm{RMSE}_q(i) = \frac{1}{N} \sum_{n=1}^{N} \sqrt{(\hat{x}_{n,q}(i) - x_q)^2 + (\hat{y}_{n,q}(i) - y_q)^2}, \qquad i = 1, \ldots, I$$

where $(\hat{x}_{n,q}(i), \hat{y}_{n,q}(i))$ is the estimated position of the $q$th emitter at the
$n$th experiment and at the $i$th iteration step; $I$ is the maximum number of iterations in
all trials; and $N$ is the number of experiments with at least $i$ iterations. Denote by
$N_{\exp}$ the number of experiments. As indicated previously, $N_{\exp} = 50$. The value
of $N/N_{\exp}$ (in percent) is shown in Figures 10.8 and 10.10.

Assume now that we have two emitters ($Q = 2$) and that each of the stations
is equipped with a uniform circular array (UCA). Each array consists of five
elements: four equally distributed on a circle with radius $\lambda/2$ and the fifth
(the reference) located at the circle center, where $\lambda$ is the wavelength of the
transmitted signals. In Figure 10.8 we show the RMSE of the first emitter using
circular arrays in all stations. The SNR is 20 dB. The numbers indicate
$N/N_{\exp}$ in percent. As can be seen, after only four iterations the CRLB is reached.

Also assume we have two emitters ($Q = 2$). A ULA with three sensors
($M = 3$) is located at each station. The spacing between adjacent sensors is $\lambda/2$.
In Figure 10.9 we show the RMSE of the first emitter versus the SNR. The SNR
varies between −4 dB and 4 dB with a 1-dB step. The number of sections is
$K = 50$. As expected, the RMSE converges to the CRLB as the SNR increases.
Also, an SNR of −2 dB is sufficient for approaching the CRLB.

Finally, assume we have three emitters ($Q = 3$) and a ULA with three sensors
($M = 3$) located at each station. In Figure 10.10 we plot the RMSE of the three
emitters for SNR = 20 dB. As can be seen, after five iterations the CRLB is
reached in 92% of the trials. Only in 2% of the cases are 10 iterations required
for convergence.


10.2.4 DPD for Known Transmitted Waveforms 

In certain applications the transmitted waveforms are known to the location 
system. In cellular systems, for example, synchronization and training sequences 
are transmitted periodically and are known a priori. Moreover, it is possible to 
detect the data sequence of a digitally modulated signal and then restore the 
complex signal envelope based on the known modulation scheme.

FIGURE 10.10 Convergence of the algorithm for three emitters. (Panels: first emitter,
second emitter, third emitter.)

In Amar and Weiss (2006) it is shown that the LLF in Equation (10.9) is given
under the assumption of a known flat noise spectral level as


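$$F_1 = \mathrm{tr}\left\{ \sum_{u=1}^{J} \left[\mathbf{A}(u)\mathbf{B} - \mathbf{C}(u)\right]^H \left[\mathbf{A}(u)\mathbf{B} - \mathbf{C}(u)\right] \hat{\mathbf{R}}_{ss}(u) \right\} \tag{10.51}$$

where

$$\mathbf{C}(u) \triangleq \hat{\mathbf{R}}_{sr}^H(u)\, \hat{\mathbf{R}}_{ss}^{-1}(u) \tag{10.52}$$

and

$$\hat{\mathbf{R}}_{ss}(u) \triangleq \frac{1}{K} \sum_{k=1}^{K} \mathbf{s}(k,u)\, \mathbf{s}^H(k,u) \tag{10.53}$$

$$\hat{\mathbf{R}}_{sr}(u) \triangleq \frac{1}{K} \sum_{k=1}^{K} \mathbf{s}(k,u)\, \mathbf{r}^H(k,u) \tag{10.54}$$

Assuming that the signals are uncorrelated, $\hat{\mathbf{R}}_{ss}(u)$ is asymptotically diagonal.
Therefore, the cost function $F_1$ can be written as

$$F_1 = \sum_{q=1}^{Q} F_1(q) \tag{10.55}$$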


where $F_1(q)$ is the cost function associated with the $q$th emitter. That is, the
estimation of the position and attenuations of each emitter is decoupled from the
estimation of the positions and attenuations of the others.

Define the matrix $\mathbf{A}_p(u)$, which has the same structure as $\mathbf{A}_q(u)$ for a source
located at some point $\mathbf{p}$. We also use $\mathbf{b}$ as a general vector having the
structure of $\mathbf{b}_q$. It can be shown that the cost function $F_1(q)$ is given by

$$F_1(q) = \sum_{u=1}^{J} \left\| \mathbf{c}_q(u) - \mathbf{A}_p(u)\mathbf{b} \right\|^2 \tag{10.56}$$

where $\mathbf{c}_q(u)$ represents the $q$th column of $\mathbf{C}(u)$. The vector $\mathbf{b}$ that
minimizes the cost function is given by

$$\hat{\mathbf{b}} = \left[ \sum_{u=1}^{J} \mathbf{A}_p^H(u)\mathbf{A}_p(u) \right]^{-1} \sum_{u=1}^{J} \mathbf{A}_p^H(u)\, \mathbf{c}_q(u) \tag{10.57}$$

Substituting Equation (10.57) in Equation (10.56), we get a cost function that
depends on $\mathbf{p}$ only. It is then easy to verify that minimizing Equation (10.56) is
equivalent to maximizing

$$\tilde{F}_1(q) = \left\| \sum_{u=1}^{J} \mathbf{A}_p^H(u)\, \mathbf{c}_q(u) \right\|^2 = \sum_{\ell=1}^{L} \left| \sum_{u=1}^{J} e^{i\omega_u \tau_\ell(\mathbf{p})}\, \mathbf{a}_\ell^H(\mathbf{p})\, \mathbf{c}_q^{(\ell)}(u) \right|^2 \tag{10.58}$$

where $\mathbf{c}_q^{(\ell)}(u)$ is the $\ell$th subvector of $\mathbf{c}_q(u)$ associated with the
$\ell$th base station. The estimated position of the $q$th emitter is given as

$$\hat{\mathbf{p}} = \arg\max_{\mathbf{p}} \tilde{F}_1(q) \tag{10.59}$$

The estimation of each of the $Q$ emitters is performed as described in Equation (10.59)
by substituting the associated vector $\mathbf{c}_q(u)$, $q = 1, \ldots, Q$, in Equation (10.58).

Note that Equation (10.58) indicates that $\tilde{F}_1(q)$ is a sum of $L$ distinct cost
functions, each associated with a distinct base station. This is the reason that
DPD outperforms methods that maximize the cost function at each base station
independently.
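For a single emitter with a known waveform, the search over Equation (10.58) can be sketched as follows. The sketch assumes NumPy and a hypothetical five-element circular array (four elements on a circle of radius $\lambda/2$ plus one at the center, with a fixed orientation at every station); the bin frequencies, noise level, grid, and function names are assumptions made for the example.

```python
import numpy as np

C_LIGHT = 3e8
# Element positions in wavelengths: center plus four points on a circle of radius 1/2
ELEMENTS = 0.5 * np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1]])

def A_p(p, stations, omega):
    # ML x L block-diagonal response matrix for a source at p, Eqs. (10.4) and (10.6)
    L, M = len(stations), len(ELEMENTS)
    A = np.zeros((M * L, L), dtype=complex)
    for l, st in enumerate(stations):
        d = p - st
        dist = np.linalg.norm(d)
        a = np.exp(2j * np.pi * ELEMENTS @ (d / dist))
        A[l * M:(l + 1) * M, l] = a * np.exp(-1j * omega * dist / C_LIGHT)
    return A

def c_columns(r, s):
    # C(u) = R_sr^H(u) R_ss^{-1}(u), Eqs. (10.52)-(10.54); r is ML x K, s is Q x K
    K = s.shape[1]
    R_ss = s @ s.conj().T / K
    R_sr = s @ r.conj().T / K
    return R_sr.conj().T @ np.linalg.inv(R_ss)

def known_waveform_cost(p, stations, omegas, c_q):
    # || sum_u A_p^H(u) c_q(u) ||^2, Eq. (10.58)
    acc = sum(A_p(p, stations, w).conj().T @ c for w, c in zip(omegas, c_q))
    return np.linalg.norm(acc) ** 2

rng = np.random.default_rng(3)
stations = np.array([[-2000., -2000.], [-2000., 2000.], [2000., -2000.], [2000., 2000.]])
source = np.array([1000., 1000.])
b = np.array([1.0, 1.0, 10 ** -0.5, 10 ** -0.5])   # 0 dB to two stations, -10 dB to two
J, K = 4, 100
omegas = 2 * np.pi * 50e3 * np.arange(1, J + 1)    # assumed bin frequencies

c_q = []
for w in omegas:
    s = (rng.standard_normal((1, K)) + 1j * rng.standard_normal((1, K))) / np.sqrt(2)
    n = 0.05 * (rng.standard_normal((20, K)) + 1j * rng.standard_normal((20, K)))
    r = (A_p(source, stations, w) @ b)[:, None] * s + n    # Eq. (10.7) with Q = 1
    c_q.append(c_columns(r, s)[:, 0])

axis = np.arange(-1500., 1501., 250.)
cost = np.array([[known_waveform_cost(np.array([x, y]), stations, omegas, c_q)
                  for x in axis] for y in axis])
iy, ix = np.unravel_index(np.argmax(cost), cost.shape)
p_hat = np.array([axis[ix], axis[iy]])
print("estimated position:", p_hat)
```

In practice a coarse grid such as this one would be followed by a finer search around the peak; the coarse grid is used here only to keep the example small.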

An important aspect of an algorithm is its computation load. In Amar and
Weiss (2006, Appendix H), it is shown that the ratio between the total number
of operations required for AOA localization and for DPD localization is


(AOA) op = ^ 2_ 

(DPD) 0/ , MQ > M J 


(10.60) 


As the number of sources increases, this ratio increases exponentially, due to 
the complexity involved in the “angle source” association required in AOA 
localization of several sources simultaneously. However, for a single emitter 
no association is required and then the number of AOA operations is fewer than 
the number of DPD operations. 


Model Errors Analysis 

Errors in DPD estimates can arise from non-LOS propagation, array calibration
errors, sensor-positioning errors, and so forth. For example, in practice the
assumption that sensor locations are known precisely does not always hold. Sensors
mounted on moving platforms such as cars, ships, and airplanes are prone to
location errors. In Amar and Weiss (2006) we discussed imprecise knowledge of sensor
locations and scattering environment. Here we consider the former case only.

Assume that $L_e$ out of $L$ arrays suffer from position errors and in each array
only $M_e$ of $M$ elements are displaced. Assume that sensor position errors are
Gaussian random variables with zero mean and variance $\varepsilon^2$. In Amar and Weiss
(2006) it is shown that the effect of sensor position errors is approximately on
the same order as the effect of observation noise if


where 



2 

-~ In 

(2 7t/k) 2 



2 A cr 2 

Y = —^- 

^[Rss]g,g 


(10.61) 


(10.62) 


which can be interpreted as the SNR.

FIGURE 10.11 … stations, each equipped with a 5-element circular array, are used.


Numerical Examples 

We compare the performance of the DPD method with that of the traditional AOA
approach for known signals. Each location determination is based on a specified
number, $K$, of 4.5-msec snapshots using a single frequency bin ($J = 1$). The
snapshot length ensures that the errors introduced by the finite-length FFT are
30 dB below the signal level. The transmitted waveforms are realizations of a
Gaussian random process.

We used four base stations located at (−2, −2), (−2, +2), (2, −2), (2, +2)
(in km) and a single source at (1, 1) (in km). Each base station is equipped with a
UCA of five elements. The waveforms are known, and the number of snapshots is
1000. The attenuation to two base stations is 0 dB; to the other two, −10 dB. The
accuracy results are plotted in Figure 10.11. The plots indicate that DPD is superior
to the traditional approach of independent AOA estimates at each base station.

We also consider displaced sensors. Each of the four stations is equipped with
a ULA of 11 elements. Only the array elements of the first station are displaced.
The standard deviation of the location error is $\epsilon$. We discuss the
performance of DPD and AOA as a function of $\epsilon/\lambda$ and the number of
displaced elements, $M_e$, with that number varied between 1 and 11. The RMSE of
both DPD and AOA is shown in Figure 10.12, where we plot the DPD and AOA errors as
a function of the standard deviation for three cases: 1, 6, and 11 displaced
elements. Note that for a small number of displaced elements, the difference
between DPD and AOA is small. However, as that number increases, so does the
difference.

FIGURE 10.12 RMSE of AOA and DPD for (a) 1, (b) 6, and (c) 11 displaced elements.


10.3 LOCALIZATION FOR NONSTATIONARY GEOMETRY 


The use of the Doppler effect for localization is seen in radar, sonar, and satellites.
Here, we focus on locating a stationary radio emitter by moving receivers. The
motion induces a frequency shift that is proportional to the signal frequency and
to the radial velocity of each receiver toward (or away from) the emitter. It is
assumed that receiver location and velocity are known and therefore that emitter
location can be estimated.

One of the localization methods discussed in the literature is differential
Doppler (DD), also known as frequency difference of arrival, which consists of
measuring frequency differences between receivers; this eliminates the need
to know the exact transmit frequency. DD can be obtained by estimating the
frequency at each receiver and then computing the difference (Chestnut, 1982,
Section III) or by directly measuring the frequency difference using signal
cross-correlation (Knapp and Carter, 1977; Stein, 1993). The latter approach
requires the transfer of raw data to the LPU and therefore is associated with
higher transmission rates compared with the former approach. A data compression
technique was recently proposed to reduce the transmission load (Fowler, 2000).

Differential Doppler was proposed for locating a moving emitter using
stationary receivers in sonar (Schultheiss and Weinstein, 1979; Weinstein, 1982)
and artillery (Weinstein and Levanon, 1980) applications. A similar problem of
localizing a moving tone source was discussed in Chan and Jardine (1990) and
Chan and Towers (1992a,b) without employing DD.


A system for localization of a stationary emitter can be implemented by at
least a single platform carrying at least two receivers. Several platforms, each
carrying a single receiver, are more common. Becker (1992) investigated the
positioning error of a radar transmitting a pure, stable, unknown tone based on
multiple measurements of frequency and bearing taken along a single platform
trajectory. Becker (1999) extended his previous contribution by considering the
drift and frequency hopping of the transmitted signal. Considering measurements
collected by a single receiver, Fowler (2001) investigated the accuracy of 3D
localization using terrain data. Levanon (1989) discussed a DD location system
based on two receivers on a single platform and compared its performance with
interferometer measurements.

The Doppler effect has also been used for stationary emitter geolocation
by satellites. The SARSAT/COSPAS (Scales and Swanson, 1984) and ARGOS
(Bessis, 1981) systems determine the emitter's position using a single satellite
receiver. The satellite relays the observed signal to an Earth station where the
instant of zero Doppler shift is determined. Zero Doppler shift is associated
with the point where the satellite is closest to the emitter, also known as the point
of closest approach (PCA). The emitter's position is then determined from the
PCA and the known satellite location and velocity. An error analysis of these
systems was presented by Levanon and Zaken (1985).


An interesting and related application of localization based on the Doppler
effect was demonstrated by the U.S. Navy TRANSIT system (Levanon, 1980).
The localization of a stationary emitter with multiple moving platforms was
discussed by Haworth et al. (1997), who presented a system for localizing satellite
interference sources based on DD and TDOA. Ho and Chan (1997) considered
the same system and provided an analytical solution for source location. Later,
Pattison and Chou (2000) examined the effect of satellite position and velocity
errors on this solution.

All of the just mentioned approaches use two steps for localization. In the 
first step the Doppler frequency shifts, or their differences, are estimated without 
the constraint that all estimates must correspond to the same emitter location and 
the same transmitted frequency. Only in the second step is the location estimated 
based on the results of the first step. For this reason, these methods are not 
guaranteed to yield optimal localization results. The objective of this section is 
to apply the ideas of the DPD approach to this location problem. 


10.3.1 Problem Formulation 

Consider a stationary narrowband radio emitter and $L$ moving receivers. The
receivers are assumed to be synchronized in frequency and time. The emitter's
position is denoted by the vector of coordinates $\mathbf{p}_0$. Each receiver
intercepts the transmitted signal at $K$ short intervals along its trajectory. Let
$\mathbf{p}_{\ell,k}$ and $\mathbf{v}_{\ell,k}$, where $k = 1, \ldots, K$ and
$\ell = 1, \ldots, L$, denote the position and velocity vectors of the $\ell$th
receiver at the $k$th interception interval, respectively.

The complex signal observed by the $\ell$th receiver at the $k$th interception
interval at time $t$ is

$$r_{\ell,k}(t) = b_{\ell,k}\, s_k(t)\, e^{j2\pi f_{\ell,k} t} + w_{\ell,k}(t), \quad 0 \le t \le T \tag{10.63}$$

where $T$ is the observation time interval; $b_{\ell,k}$ is an unknown complex scalar
representing the path attenuation at the $k$th interception interval observed by
the $\ell$th receiver; $s_k(t)$ is the observed narrowband signal envelope during the
$k$th interception interval, which may be known or unknown depending on the
application; $w_{\ell,k}(t)$ is a white zero-mean complex Gaussian noise; and
$f_{\ell,k}$ is the frequency observed by the $\ell$th receiver during the $k$th
interception interval, given by

$$f_{\ell,k} \triangleq [f_c + \nu_k]\,[1 + \mu_{\ell,k}(\mathbf{p}_0)] \tag{10.64}$$

$$\mu_{\ell,k}(\mathbf{p}_0) \triangleq \frac{1}{c}\, \frac{\mathbf{v}_{\ell,k}^T\, [\mathbf{p}_0 - \mathbf{p}_{\ell,k}]}{\|\mathbf{p}_0 - \mathbf{p}_{\ell,k}\|} \tag{10.65}$$


where $f_c$ is the nominal frequency of the transmitted signal, assumed known;
$\nu_k$ is the unknown transmitted frequency shift due to source instability during
the $k$th interception interval; and $c$ is the signal's propagation speed. Since
$\mu_{\ell,k} \ll 1$ and $\nu_k \ll f_c$, Equation (10.64) can be approximated as

$$f_{\ell,k} \cong \nu_k + f_c\,[1 + \mu_{\ell,k}(\mathbf{p}_0)] \tag{10.66}$$

where the term $\nu_k \mu_{\ell,k}$, which is negligible w.r.t. all other terms, is omitted.

Assume that each receiver performs a down-conversion of the intercepted
signal by $f_c$. Thus, Equation (10.64) is replaced by

$$\bar{f}_{\ell,k} \triangleq f_{\ell,k} - f_c = \nu_k + f_c\, \mu_{\ell,k}(\mathbf{p}_0) \tag{10.67}$$
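
As a rough numerical illustration of Equations (10.64) through (10.67), the
following Python sketch evaluates the normalized radial velocity, the exact
observed frequency, and the approximate down-converted frequency for one
receiver. The geometry, the receiver velocity, and the 50-Hz source offset are
illustrative assumptions; the function names are hypothetical.

```python
import math

C_LIGHT = 3.0e8  # assumed propagation speed c, in m/s


def mu(p0, p, v, c=C_LIGHT):
    """Eq. (10.65): (1/c) * v^T (p0 - p) / ||p0 - p|| for 2-D vectors."""
    dx, dy = p0[0] - p[0], p0[1] - p[1]
    return (v[0] * dx + v[1] * dy) / (c * math.hypot(dx, dy))


def observed_frequency(fc, nu, p0, p, v):
    """Eq. (10.64): exact frequency seen by the moving receiver."""
    return (fc + nu) * (1.0 + mu(p0, p, v))


def downconverted_frequency(fc, nu, p0, p, v):
    """Eq. (10.67): approximate frequency after down-conversion by fc."""
    return nu + fc * mu(p0, p, v)


# Illustrative (assumed) scenario: 0.1-GHz carrier, 50-Hz source offset,
# emitter at (4, 6) km, receiver at (1, 0) km moving at 300 m/s along +x.
fc, nu = 0.1e9, 50.0
p0 = (4000.0, 6000.0)
p, v = (1000.0, 0.0), (300.0, 0.0)

exact = observed_frequency(fc, nu, p0, p, v) - fc
approx = downconverted_frequency(fc, nu, p0, p, v)
print(f"exact shift  : {exact:.6f} Hz")
print(f"approx shift : {approx:.6f} Hz")
print(f"omitted term nu*mu: {exact - approx:.3e} Hz")
```

In this assumed scenario the omitted cross term is several orders of magnitude
below the Doppler shift itself, which is the reason given in the text for dropping it.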

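The transmitted frequency is assumed to be constant during the interception
interval, $T$. The down-converted signal is sampled at times $t_n = nT_s$, where
$n = 0, \ldots, N-1$ and $T_s = T/(N-1)$. Denote the sampled signal at the $k$th
interception interval by $r_{\ell,k}[n] \triangleq r_{\ell,k}(nT_s)$. Then Equation
(10.63) can be written in vector form as

$$\mathbf{r}_{\ell,k} = b_{\ell,k}\, \mathbf{A}_{\ell,k}\, \mathbf{C}_k\, \mathbf{s}_k + \mathbf{w}_{\ell,k} \tag{10.68}$$

where

$$\mathbf{r}_{\ell,k} \triangleq [r_{\ell,k}[0], \ldots, r_{\ell,k}[N-1]]^T \tag{10.69}$$

$$\mathbf{w}_{\ell,k} \triangleq [w_{\ell,k}[0], \ldots, w_{\ell,k}[N-1]]^T \tag{10.70}$$

$$\mathbf{s}_k \triangleq [s_k[0], \ldots, s_k[N-1]]^T \tag{10.71}$$

$$\mathbf{A}_{\ell,k} \triangleq \mathrm{diag}\{1, e^{j2\pi f_c \mu_{\ell,k} T_s}, \ldots, e^{j2\pi f_c \mu_{\ell,k} (N-1) T_s}\} \tag{10.72}$$

$$\mathbf{C}_k \triangleq \mathrm{diag}\{1, e^{j2\pi \nu_k T_s}, \ldots, e^{j2\pi \nu_k (N-1) T_s}\} \tag{10.73}$$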

Note that the preceding equations are accurate only if all receivers are
synchronized in frequency and time. Moreover, the signal complex envelope
$s_k(t)$ is the same at all spatially separated receivers provided that the signal
bandwidth (rate of signal change) is small compared to the inverse of the propagation
time between receivers. This places a restriction on the receiver spatial separation
for a given signal bandwidth.

The problem can be briefly stated as follows: Given the observation vectors
$\{\mathbf{r}_{\ell,k}\}$ in Equation (10.68), estimate the position of the emitter.
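
The factorization of the sampled signal into a position-dependent diagonal
matrix and a source-dependent diagonal matrix can be checked numerically. The
sketch below is only an illustration under assumed values (eight samples, an
arbitrary envelope, noise omitted): it builds the noise-free vector
$b\,\mathbf{A}\,\mathbf{C}\,\mathbf{s}$ from two phase ramps and compares it with
direct sampling of a complex exponential at the down-converted frequency
$\nu_k + f_c\mu_{\ell,k}$. The helper name `phase_ramp` is hypothetical.

```python
import cmath
import math
import random


def phase_ramp(freq, n_samples, ts):
    """Diagonal of diag{1, e^{j 2 pi freq Ts}, ..., e^{j 2 pi freq (N-1) Ts}}."""
    return [cmath.exp(2j * math.pi * freq * n * ts) for n in range(n_samples)]


random.seed(0)
N, TS = 8, 1.0e-3          # assumed number of samples and sampling interval
FC_MU = 44.7               # assumed value of f_c * mu_{l,k}, in Hz
NU = -20.0                 # assumed source frequency offset nu_k, in Hz
b = 0.9 * cmath.exp(0.3j)  # assumed complex path attenuation

# Arbitrary complex envelope samples s_k[n].
s = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(N)]

a_diag = phase_ramp(FC_MU, N, TS)   # diagonal of A_{l,k}
c_diag = phase_ramp(NU, N, TS)      # diagonal of C_k

# Noise-free vector model: r = b * A * C * s (all matrices diagonal).
r_vector = [b * a_diag[n] * c_diag[n] * s[n] for n in range(N)]

# Direct sampling of b * s(t) * exp(j 2 pi (nu + fc*mu) t) at t = n * Ts.
f_bar = NU + FC_MU
r_direct = [b * s[n] * cmath.exp(2j * math.pi * f_bar * n * TS) for n in range(N)]

max_diff = max(abs(x - y) for x, y in zip(r_vector, r_direct))
print("largest discrepancy:", max_diff)
```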


10.3.2 Differential Doppler Localization Method 

To simplify our description of the DD positioning approach, we assume two
receivers. Denote by $\Delta\hat{f}_k$ the estimated frequency difference between the
first receiver and the second receiver during the $k$th interception interval. For
example, the MLE of the frequency difference $f$ between two sequences
$r_{1,k}[n]$, $r_{2,k}[n]$ of length $N$ can be obtained by converting Equation 14 in
Stein (1993) into the discrete time domain:

$$\Delta\hat{f}_k = \arg\max_{f} \left| \sum_{n=0}^{N-1} r_{1,k}[n]\, r_{2,k}^{*}[n]\, e^{-j2\pi f n T_s} \right|^2 \tag{10.74}$$

In our case, $r_{1,k}[n]$ represents the sampled output of a receiver at the $k$th
interception interval, and $r_{2,k}[n]$ may represent either the sampled output of a
second receiver or samples of the known waveform.

The frequency difference is used to eliminate the unknown transmitted
frequency offset $\nu_k$. Using Equation (10.67), the frequency difference associated
with the $k$th interception interval is given by

$$\Delta\hat{f}_k = f_c\, \Delta\mu_k(\mathbf{p}_0) + \epsilon_k \tag{10.75}$$

where

$$\Delta\mu_k(\mathbf{p}_0) \triangleq \mu_{1,k}(\mathbf{p}_0) - \mu_{2,k}(\mathbf{p}_0) \tag{10.76}$$

and $\epsilon_k$ is an error reflecting the measurement errors and all other model errors.
Using the LS principle, the position estimator is given by

$$\hat{\mathbf{p}}_0 = \arg\min_{\mathbf{p}} \sum_{k=1}^{K} \left| \Delta\hat{f}_k - f_c\, \Delta\mu_k(\mathbf{p}) \right|^2 \tag{10.77}$$

Note that for accurate DD estimates the receivers should be frequency-synchronized
with high precision. Otherwise, the synchronization errors will corrupt the
measurements.
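
The second step of the DD method, Equation (10.77), can be sketched as a plain
grid search. The Python example below is a minimal illustration rather than the
authors' implementation: the receiver tracks loosely follow the two-receiver
layout used later in the numerical examples, while the emitter position, the
0.05-Hz measurement error, and the 1-km grid (offset by 500 m so that no grid
point coincides with a receiver position) are assumptions made here.

```python
import math
import random

C = 3.0e8      # propagation speed, m/s
FC = 0.1e9     # nominal carrier frequency, Hz
K = 10         # interception intervals


def mu(p0, p, v):
    """Eq. (10.65) for 2-D position and velocity vectors."""
    dx, dy = p0[0] - p[0], p0[1] - p[1]
    return (v[0] * dx + v[1] * dy) / (C * math.hypot(dx, dy))


# (position, velocity) of each receiver at each interception interval.
rx1 = [((1000.0 * k, 0.0), (300.0, 0.0)) for k in range(1, K + 1)]
rx2 = [((1000.0 * (11 - k), 10000.0), (-300.0, 0.0)) for k in range(1, K + 1)]


def delta_f(p):
    """Model frequency differences f_c * (mu_1k(p) - mu_2k(p)), Eqs. (10.75)-(10.76)."""
    return [FC * (mu(p, p1, v1) - mu(p, p2, v2))
            for (p1, v1), (p2, v2) in zip(rx1, rx2)]


random.seed(1)
p_true = (3500.0, 6500.0)                      # assumed emitter position
# Stand-in for the first-step estimates: true differences plus small errors.
measured = [df + random.gauss(0.0, 0.05) for df in delta_f(p_true)]


def ls_cost(p):
    """Least-squares cost of Eq. (10.77) at candidate position p."""
    return sum((m - d) ** 2 for m, d in zip(measured, delta_f(p)))


grid = [(500.0 + 1000.0 * i, 500.0 + 1000.0 * j)
        for i in range(10) for j in range(10)]
p_hat = min(grid, key=ls_cost)
print("estimated position:", p_hat)
```

Note that the unknown offsets $\nu_k$ never appear in the cost: they cancel in the
difference, which is the motivation for DD given in the text.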


10.3.3 DPD Localization Approach 

Consider the observation vectors in Equation (10.68). The information on the
emitter's position is embedded in each of the matrices $\mathbf{A}_{\ell,k}$. This
position is common to all observations at all interception intervals, so we estimate
it as the position that best explains all the data together. Because of its excellent
asymptotic properties (consistency and efficiency), we focus on the MLE.

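The LLF of the observation vectors is given (up to an additive constant) by

$$L_1 = -\frac{1}{\sigma_n^2} \sum_{k=1}^{K} \sum_{\ell=1}^{L} \left\| \mathbf{r}_{\ell,k} - b_{\ell,k}\, \mathbf{A}_{\ell,k} \mathbf{C}_k \mathbf{s}_k \right\|^2 \tag{10.78}$$

The path attenuation scalars that maximize Equation (10.78) are

$$\hat{b}_{\ell,k} = \left[ (\mathbf{A}_{\ell,k} \mathbf{C}_k \mathbf{s}_k)^H \mathbf{A}_{\ell,k} \mathbf{C}_k \mathbf{s}_k \right]^{-1} (\mathbf{A}_{\ell,k} \mathbf{C}_k \mathbf{s}_k)^H \mathbf{r}_{\ell,k} = (\mathbf{A}_{\ell,k} \mathbf{C}_k \mathbf{s}_k)^H \mathbf{r}_{\ell,k} \tag{10.79}$$

where we assume, without loss of generality, that $\|\mathbf{s}_k\|^2 = 1$ and use the
special structure of $\mathbf{A}_{\ell,k}$ and $\mathbf{C}_k$ (both are diagonal with
unit-modulus entries, so the bracketed term equals $\|\mathbf{s}_k\|^2$). Substitution
of Equation (10.79) in Equation (10.78) yields

$$L_1 = -\frac{1}{\sigma_n^2} \sum_{k=1}^{K} \sum_{\ell=1}^{L} \left[ \|\mathbf{r}_{\ell,k}\|^2 - \left| (\mathbf{A}_{\ell,k} \mathbf{C}_k \mathbf{s}_k)^H \mathbf{r}_{\ell,k} \right|^2 \right] \tag{10.80}$$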


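Since $\|\mathbf{r}_{\ell,k}\|^2$ is independent of the parameters, instead of
maximizing Equation (10.80) we can now maximize the cost function $L_2$ given by

$$L_2 = \sum_{k=1}^{K} \sum_{\ell=1}^{L} \left| (\mathbf{A}_{\ell,k} \mathbf{C}_k \mathbf{s}_k)^H \mathbf{r}_{\ell,k} \right|^2 = \sum_{k=1}^{K} (\mathbf{C}_k \mathbf{s}_k)^H \mathbf{Q}_k \mathbf{C}_k \mathbf{s}_k \tag{10.81}$$

where we define the $N \times N$ Hermitian matrix as

$$\mathbf{Q}_k \triangleq \mathbf{V}_k \mathbf{V}_k^H \tag{10.82}$$

and where the $N \times L$ matrix is given as

$$\mathbf{V}_k \triangleq \left[ \mathbf{A}_{1,k}^H \mathbf{r}_{1,k}, \ldots, \mathbf{A}_{L,k}^H \mathbf{r}_{L,k} \right] \tag{10.83}$$

We now consider the two cases of unknown and a priori known transmitted
signals.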


Unknown Transmitted Signals 

Define the unknown $N \times 1$ vector $\mathbf{u}_k \triangleq \mathbf{C}_k \mathbf{s}_k$.
The cost function in Equation (10.81) is maximized by maximizing each of the $K$
quadratic forms w.r.t. $\mathbf{u}_k$. Thus, according to Theorem 10.1, the vector
$\mathbf{u}_k$ should be selected as the eigenvector corresponding to the largest
eigenvalue of the matrix $\mathbf{Q}_k$. The cost function in Equation (10.81)
therefore reduces to

$$L_3 = \sum_{k=1}^{K} \lambda_{\max}\{\mathbf{Q}_k\} \tag{10.84}$$

The dimension of $\mathbf{Q}_k$ is $N \times N$, and thus it increases with the number
of data samples. Determining the eigenvalues of $\mathbf{Q}_k$ can in turn result in
high computation effort. Instead, define the $L \times L$ matrix

$$\bar{\mathbf{Q}}_k \triangleq \mathbf{V}_k^H \mathbf{V}_k \tag{10.85}$$

According to Rao (2002, pp. 42-43) we see that the nonzero eigenvalues of
$\mathbf{Q}_k$ and $\bar{\mathbf{Q}}_k$ are identical. Therefore, recalling the
definition in Equation (10.82), $\mathbf{Q}_k$ can be replaced with
$\bar{\mathbf{Q}}_k$. This substantially reduces the computation load since
$L \ll N$.


The estimated emitter's position $\hat{\mathbf{p}}_0$ is determined by a grid search.
For any grid point $\mathbf{p}$ in the position space, evaluate Equation (10.84) with
$\mathbf{Q}_k$ replaced by $\bar{\mathbf{Q}}_k$ and obtain

$$L_{us}(\mathbf{p}) = \sum_{k=1}^{K} \lambda_{\max}\{\bar{\mathbf{Q}}_k\} \tag{10.86}$$

The estimated emitter's position is then given by

$$\hat{\mathbf{p}}_0 = \arg\max_{\mathbf{p}} \{L_{us}(\mathbf{p})\} \tag{10.87}$$

An algorithm that uses Equation (10.87) to estimate the emitter position is usually
more precise than a two-step procedure.
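
The grid search of Equations (10.86) and (10.87) can be sketched for two
receivers, where the largest eigenvalue of the $2 \times 2$ Hermitian matrix
$\bar{\mathbf{Q}}_k$ has a closed form. The Python example below is a simplified
illustration under assumed conditions (unit-modulus QPSK samples that are not
normalized to unit energy, a high SNR, a coarse 1-km grid offset from the receiver
tracks); all function names are hypothetical and it is not the authors' code.

```python
import cmath
import math
import random

C, FC = 3.0e8, 0.1e9          # propagation speed (m/s), carrier (Hz)
N, TS, K = 100, 1.0e-3, 10    # samples per interval, sampling period, intervals


def mu(p0, p, v):
    """Eq. (10.65) for 2-D vectors."""
    dx, dy = p0[0] - p[0], p0[1] - p[1]
    return (v[0] * dx + v[1] * dy) / (C * math.hypot(dx, dy))


def phase_ramp(freq):
    """Diagonal entries e^{j 2 pi freq n Ts}, n = 0..N-1."""
    return [cmath.exp(2j * math.pi * freq * n * TS) for n in range(N)]


# tracks[l][k] = (position, velocity) of receiver l at interval k.
tracks = [
    [((1000.0 * k, 0.0), (300.0, 0.0)) for k in range(1, K + 1)],
    [((1000.0 * (11 - k), 10000.0), (-300.0, 0.0)) for k in range(1, K + 1)],
]

random.seed(2)
p_true = (3500.0, 6500.0)     # assumed emitter position
QPSK = [cmath.exp(1j * math.pi * (0.25 + 0.5 * m)) for m in range(4)]

# Simulate r_{l,k} = b A C s + w (Eq. 10.68) with unknown s, nu, and b.
data = []
for k in range(K):
    s = [random.choice(QPSK) for _ in range(N)]
    c = phase_ramp(random.uniform(-100.0, 100.0))
    row = []
    for track in tracks:
        pos, vel = track[k]
        a = phase_ramp(FC * mu(p_true, pos, vel))
        b = random.gauss(1.0, 0.1) * cmath.exp(1j * random.uniform(-math.pi, math.pi))
        row.append([b * a[n] * c[n] * s[n]
                    + complex(random.gauss(0, 0.05), random.gauss(0, 0.05))
                    for n in range(N)])
    data.append(row)


def lambda_max_2x2(v1, v2):
    """Largest eigenvalue of V^H V for V = [v1, v2] (2 x 2 Hermitian matrix)."""
    q11 = sum(abs(x) ** 2 for x in v1)
    q22 = sum(abs(x) ** 2 for x in v2)
    q12 = sum(x.conjugate() * y for x, y in zip(v1, v2))
    return 0.5 * (q11 + q22) + math.sqrt(0.25 * (q11 - q22) ** 2 + abs(q12) ** 2)


def cost_unknown(p):
    """L_us(p) of Eq. (10.86): sum over k of lambda_max of V_k^H V_k."""
    total = 0.0
    for k in range(K):
        cols = []
        for l, track in enumerate(tracks):
            pos, vel = track[k]
            a = phase_ramp(FC * mu(p, pos, vel))
            # Column l of V_k is A_{l,k}^H r_{l,k} (Eq. 10.83).
            cols.append([a[n].conjugate() * data[k][l][n] for n in range(N)])
        total += lambda_max_2x2(cols[0], cols[1])
    return total


grid = [(500.0 + 1000.0 * i, 500.0 + 1000.0 * j)
        for i in range(10) for j in range(10)]
p_hat = max(grid, key=cost_unknown)
print("estimated position:", p_hat)
```

Working with the $L \times L$ matrix keeps the per-grid-point cost linear in $N$,
which is the saving the text attributes to replacing $\mathbf{Q}_k$ with
$\bar{\mathbf{Q}}_k$.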


Known Transmitted Signals 

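Define the vector $\mathbf{c}_k \triangleq \mathrm{diag}\{\mathbf{C}_k\}$, the matrix
$\mathbf{S}_k \triangleq \mathrm{diag}\{\mathbf{s}_k\}$, and the $N \times N$ matrix:

$$\mathbf{G}_k \triangleq \mathbf{S}_k^H \mathbf{Q}_k \mathbf{S}_k \tag{10.88}$$

Now Equation (10.81) can be simplified:

$$L_2 = \sum_{k=1}^{K} \mathbf{c}_k^H \mathbf{G}_k \mathbf{c}_k \tag{10.89}$$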























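Equation (10.89) is a polynomial in $z = e^{j2\pi \nu_k T_s}$. Define

$$\alpha_n \triangleq \mathbf{1}^T \mathrm{diag}\{\mathbf{G}_k; n\}, \quad n = 0, \pm 1, \ldots, \pm(N-1) \tag{10.90}$$

where $\mathrm{diag}\{\mathbf{G}_k; n\}$ is a column vector containing the elements of
the $n$th diagonal of the matrix $\mathbf{G}_k$, $\alpha_0$ is the sum of the main
diagonal elements, and $\alpha_{N-1}$, $\alpha_{-(N-1)}$ are the $(1, N)$-th and
$(N, 1)$-th elements of $\mathbf{G}_k$, respectively. Since $\mathbf{G}_k$ is
Hermitian, $\alpha_n = \alpha_{-n}^{*}$. Let

$$\beta_m \triangleq \begin{cases} \alpha_0, & m = 0 \\ 2\alpha_m, & m = 1, \ldots, N-1 \end{cases} \tag{10.91}$$

It can be shown that

$$\mathbf{c}_k^H \mathbf{G}_k \mathbf{c}_k = \sum_{i,j=1}^{N} [\mathbf{G}_k]_{i,j}\, z^{j-i} = \sum_{n=-N+1}^{N-1} \alpha_n z^{n} = \mathrm{Re}\left\{ \sum_{m=0}^{N-1} \beta_m^{*} z^{-m} \right\} \tag{10.92}$$

Thus, in order to find the $\nu_k$ that maximizes $\mathbf{c}_k^H \mathbf{G}_k \mathbf{c}_k$,
compute the FFT of the sequence $\{\beta_m^{*}\}_{m=0}^{N-1}$ and take the real part of
the result. The FFT length should satisfy $M \ge N$. If the largest FFT coefficient is
the $m_0$th coefficient, then

$$\hat{\nu}_k = \begin{cases} \dfrac{m_0}{M T_s}, & m_0 \le M/2 \\[2mm] \dfrac{m_0 - M}{M T_s}, & m_0 > M/2 \end{cases} \tag{10.93}$$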


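For each grid point $\mathbf{p}$ in the position space, evaluate the $K$ matrices
$\{\mathbf{G}_k\}$ and use the FFT to find the $\nu_k$ that maximizes the expression
$\mathbf{c}_k^H \mathbf{G}_k \mathbf{c}_k$. The emitter position estimate
$\hat{\mathbf{p}}_0$ is determined as the position $\mathbf{p}$ that maximizes

$$L_{ks}(\mathbf{p}) = \sum_{k=1}^{K} \max_{\nu_k} \left\{ \mathbf{c}_k^H \mathbf{G}_k \mathbf{c}_k \right\} \tag{10.94}$$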


This concludes the derivation of the proposed method for unknown and 
known signals. 
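
The frequency search for known signals can be checked numerically on a small
example. The sketch below is an illustration under assumptions: a random
Hermitian matrix stands in for $\mathbf{G}_k$, the sizes $N = 16$ and $M = 64$ are
arbitrary, and a naive DFT written in plain Python stands in for the FFT. It
forms the diagonal sums, evaluates the real part of the transform of the
conjugated sequence, and compares each bin with a direct evaluation of the
quadratic form.

```python
import cmath
import math
import random

random.seed(3)
N, M, TS = 16, 64, 1.0e-3   # assumed matrix size, transform length, sampling period

# Random Hermitian positive semidefinite stand-in for G_k: G = B^H B, B is 2 x N.
B = [[complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(N)]
     for _ in range(2)]
G = [[sum(B[l][i].conjugate() * B[l][j] for l in range(2)) for j in range(N)]
     for i in range(N)]

# alpha_n: sum of the n-th superdiagonal of G (entries with j - i = n), n >= 0.
alpha = [sum(G[i][i + n] for i in range(N - n)) for n in range(N)]
# beta_0 = alpha_0, beta_m = 2 * alpha_m for m >= 1.
beta = [alpha[0]] + [2 * a for a in alpha[1:]]


def dft(x, length):
    """Naive zero-padded DFT: X[q] = sum_m x[m] e^{-j 2 pi m q / length}."""
    return [sum(x[m] * cmath.exp(-2j * math.pi * m * q / length)
                for m in range(len(x)))
            for q in range(length)]


# Real part of the transform of the conjugated beta sequence.
spectrum = [value.real for value in dft([b.conjugate() for b in beta], M)]
m0 = max(range(M), key=lambda q: spectrum[q])
nu_hat = m0 / (M * TS) if m0 <= M // 2 else (m0 - M) / (M * TS)


def quad_form(nu):
    """Direct evaluation of c^H G c with c_n = e^{j 2 pi nu n Ts}."""
    c = [cmath.exp(2j * math.pi * nu * n * TS) for n in range(N)]
    return sum(c[i].conjugate() * G[i][j] * c[j]
               for i in range(N) for j in range(N)).real


direct = [quad_form(q / (M * TS)) for q in range(M)]
max_err = max(abs(d - s) for d, s in zip(direct, spectrum))
print("estimated offset (Hz):", nu_hat)
print("largest mismatch between transform and direct evaluation:", max_err)
```

The transform evaluates the quadratic form only on a grid of spacing
$1/(MT_s)$, so a longer transform gives a finer frequency search.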


10.3.4 Computation Load 

We assess the computation load of the DD and DPD methods, assuming for
simplicity two receivers.

Differential Doppler Approach 

Consider the MLE in Equation (10.74). An efficient implementation can be
obtained with a first step of computing the FFT of $\{r_{1,k}[n]\, r_{2,k}^{*}[n]\}$.
This involves $N$ complex multiplications for obtaining
$\{r_{1,k}[n]\, r_{2,k}^{*}[n]\}$, $N \log_2 N$ multiplications for computing the FFT,
and finally $N$ complex multiplications for obtaining the squared absolute value.
Thus, along the track the total number of complex multiplications is
$KN(2 + \log_2 N) \cong KN \log_2 N$. This number will be doubled if the signals are
known and the frequency is estimated for each of the two receivers separately. In
the second step, the cost function in Equation (10.77) is evaluated. For each point
in the grid, $14K$ real multiplications (or $7K$ complex multiplications) are
required. Thus, for $N_g$ grid points, the total number of operations is dominated
by $7N_gK + KN \log_2 N$ if the signals are unknown, and therefore cross-correlation
is used, or $7N_gK + 2KN \log_2 N$ if the signals are known.

Proposed Approach 

For known signals, the estimated position in our approach is determined by the
maximum of Equation (10.94) over all grid points. Note that
$\mathbf{G}_k = \mathbf{B}_k^H \mathbf{B}_k$, where $\mathbf{B}_k$ is an $L \times N$
matrix defined as $\mathbf{B}_k \triangleq \mathbf{V}_k^H \mathbf{S}_k$. The number of
multiplications required to evaluate $\mathbf{B}_k$ is $2NL$ since
$\mathbf{A}_{\ell,k}$ and $\mathbf{S}_k$ are diagonal matrices. Since $\mathbf{B}_k$ is
an $L \times N$ matrix, the number of multiplications required to evaluate the
Hermitian matrix $\mathbf{G}_k$ is approximately $0.5LN^2$. Therefore, we need a total
of $2NL + 0.5LN^2$ multiplications to evaluate $\mathbf{G}_k$.

The scalar $\mathbf{c}_k^H \mathbf{G}_k \mathbf{c}_k$ is evaluated by FFT as previously
described. The FFT requires approximately $N \log_2 N$ multiplications. The total
number of operations for $N_g$ grid points is therefore
$N_g K (2NL + 0.5LN^2 + N \log_2 N) \cong 0.5 N_g K L N^2$.
For unknown signals the evaluation of $\mathbf{V}_k$ in Equation (10.83) requires $NL$
multiplications, and the evaluation of $\bar{\mathbf{Q}}_k$ requires approximately an
additional $0.5NL^2$ multiplications. Finding the eigenvalues in Equation (10.86)
requires $L^3$ multiplications. The total number of multiplications is
$N_g K (L^3 + 0.5NL^2 + NL) \cong N_g K L N (0.5L + 1)$. Thus, in most cases the
proposed approach requires considerably more computation than the DD, even if
$L = 2$.
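
To make the comparison concrete, the operation counts quoted above can be
evaluated for a specific configuration. The sketch below simply codes the
expressions from this section; the grid size of 10,000 points is an assumed
value chosen for illustration, and the function names are hypothetical.

```python
import math


def dd_multiplications(n_grid, K, N, known_signals):
    """DD count: 7*Ng*K for the grid search plus the first-step FFT cost."""
    fft_cost = K * N * math.log2(N)
    return 7 * n_grid * K + (2 * fft_cost if known_signals else fft_cost)


def dpd_multiplications(n_grid, K, N, L, known_signals):
    """DPD count per the expressions given for known and unknown signals."""
    if known_signals:
        per_point = 2 * N * L + 0.5 * L * N ** 2 + N * math.log2(N)
    else:
        per_point = L ** 3 + 0.5 * N * L ** 2 + N * L
    return n_grid * K * per_point


# K, N, and L follow the numerical examples; the grid size is an assumption.
n_grid, K, N, L = 10_000, 10, 100, 2

counts = {}
for known in (False, True):
    dd = dd_multiplications(n_grid, K, N, known)
    dpd = dpd_multiplications(n_grid, K, N, L, known)
    counts[known] = (dd, dpd)
    label = "known" if known else "unknown"
    print(f"{label:8s} signals: DD = {dd:.3e}, DPD = {dpd:.3e}, ratio = {dpd / dd:.1f}")
```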


10.3.5 Numerical Examples 

We examined the performance of the proposed method and compared it with the
DD method and with the CRLB (Amar and Weiss, 2008) using MC computer
simulations. We focused on the position RMSE defined in Equation (10.38). To
obtain statistical results, we used 100 MC trials.

The simulated signal was a 10-Kbps quadrature phase shift keying (QPSK)
communication signal, sampled at $10^3$ samples per second. Unless stated
otherwise, we used 100 samples at each interception interval. The QPSK symbols
were selected at random. The same signals were used for the known and unknown
signal cases. The simulated nominal signal carrier frequency was $f_c = 0.1$ GHz.
The propagation speed was assumed to be $c = 3 \times 10^8$ m/sec. The emitter's
position was chosen at random within a square area of 10 × 10 km. The unknown
transmitted frequency shifts, $\{\nu_k\}$, were selected at random from the interval
[−100, 100] Hz. The channel attenuation was selected at random from a normal
distribution with mean 1 and standard deviation 0.1, and the channel phase was
selected at random from a uniform distribution over $[-\pi, \pi)$. These parameters
were then used for all trials.

Consider two receivers ($L = 2$), where one is moving leftward and one is
moving rightward. Unless stated otherwise, the receivers' speed is $v = 300$ m/sec.
Each receiver intercepts the signal at ten different intervals ($K = 10$) along its
trajectory. The first receiver intercepts the signal every 1 km starting at [1, 0]
km and finishing at [10, 0] km; the second receiver intercepts the signal with
the same spacing starting at [10, 10] km and finishing at [1, 10] km. As a second
configuration, three receivers ($L = 3$) are simulated. The trajectories,
velocities, and interception points of the first two receivers are the same as
previously described. The third receiver moves from [1, −0.2] km to [10, −0.2]
km and intercepts the signal every 1 km.

The localization performance of DPD is compared with the performance
of the DD for known and unknown signals. The RMSE of position estimation
versus SNR for known and unknown signals, and for two and three receivers,
is shown in Figure 10.13. In all cases DPD outperforms DD at low SNR, but at
high SNR the methods are equivalent. (SNR is defined as the ratio of the average
transmitted signal power to the average noise power.)

We now compare DD and DPD when the SNR is 20 dB for the two receivers
at all interception intervals except for the SNR of the second receiver at the
last three interception intervals. The direction and speed of each receiver are as
previously stated. The SNR at the last three intervals is varied from 20 dB to
−24 dB in steps of 2 dB.




FIGURE 10.13 RMSE of DPD, DD, and CRLB versus SNR, with known and unknown
transmitted signals, for two receivers (a) and three receivers (b).

FIGURE 10.14 RMSE of DPD and DD versus SNR of the second receiver at the last three
interception intervals.


Figure 10.14 shows the results. It can be seen that as the SNR at the last three
intervals of the second receiver decreases, the performance of the two methods
degrades. However, beyond a certain point the performance of DPD improves,
in contrast with conventional DD. DPD ignores the unreliable data and performs
as if they did not exist. In practice, the DD outlier will probably be removed by
a goodness-of-fit (chi-square) test. However, as demonstrated here, DPD works
well without such tests.

We examine the RMSE versus the number of samples, N, used in each inter¬ 
ception interval. Consider two receivers (L = 2). Assume that the first receiver 
moves from [1,0] Km to [10,0] Km and intercepts the signal every 1 Km. The 
second receiver moves from [10,1] Km to [10,10] Km and intercepts the signal 
with the same spacing. The signals are assumed unknown to the receivers. The 
SNR is —5 dB. The number of samples is changed from N = 100 to N = 500 with 
a step size of 20. 

The position RMSE versus $N$ for DD and DPD and the CRLB is shown
in Figure 10.15. For a small number of samples, DPD outperforms DD, but
as the number of samples increases the methods become equivalent. Note that
as $N$ increases, the number of unknowns increases since the signal samples
are unknown. This explains the gap between the RMSE and the CRLB, which
does not decrease with increasing $N$. Therefore, neither method is statistically
efficient.

FIGURE 10.15 RMSE of DPD, DD, and CRLB versus the number of samples N.


10.4 TWO-STEP VERSUS ONE-STEP LOCALIZATION 

As expected intuitively and demonstrated in this chapter, two-step procedures 
usually provide inferior performance with respect to single-step schemes. This 
property has been proven in the literature under asymptotic conditions, so it is of 
interest to identify indirect methods, appropriate for sensor data fusion, that are 
at least as asymptotically efficient as the direct methods. We prove here that if 
ML based on incomplete data is used in the first step and weighted least squares 
is used in the second step, indirect estimation achieves the performance of direct 
estimation for asymptotically large data sets. 

10.4.1 Problem Formulation 

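Consider a parameter vector $\boldsymbol{\theta}$ in a space $\Theta$, which is a
subspace of the $L$-dimensional real Cartesian space. Similarly, consider a different
parameter vector $\boldsymbol{\phi}$ in a space $\Phi$, which is a subspace of the
$M$-dimensional real Cartesian space where $M \le L$. In short,
$\boldsymbol{\theta} \in \Theta \subset \mathbb{R}^L$,
$\boldsymbol{\phi} \in \Phi \subset \mathbb{R}^M$. Assume that for every
$\boldsymbol{\phi} \in \Phi$ there is a $\boldsymbol{\theta} \in \Theta$; the mapping
is known and is given by some injective function
$\boldsymbol{\theta} = \mathbf{g}(\boldsymbol{\phi})$, $\forall \boldsymbol{\phi} \in \Phi$.
This mapping defines a subspace $\bar{\Theta} \subset \Theta$ given as
$\bar{\Theta} = \{\boldsymbol{\theta} : \boldsymbol{\theta} = \mathbf{g}(\boldsymbol{\phi}),\ \boldsymbol{\phi} \in \Phi\}$.

Assume now that each of the elements of the vector $\boldsymbol{\theta}$ is a
parameter that affects the p.d.f. $f(\mathbf{x}_j | \theta_j)$ of an observation vector
denoted by $\mathbf{x}_j$. Denote by $\mathbf{x}_j(n)$ the $n$th realization of
$\mathbf{x}_j$. We assume that each of the observation vectors $\mathbf{x}_j$,
$j = 1, \ldots, L$, is collected at a different location and the locations are
spatially separated. In other words, the estimation of $\theta_j$ must be based on
$\{\mathbf{x}_j(n)\}_{n=1}^{N}$, not on $\{\mathbf{x}_i(n)\}_{n=1}^{N}$, $i \ne j$.

Based on a collection of $N$ independent realizations
$\{\mathbf{x}_j(n)\}_{n=1}^{N}$, we are interested in estimating the primary parameter
of interest $\boldsymbol{\phi}$. The parameter $\boldsymbol{\phi}$ can be estimated
directly from all observation vectors (the complete data) using a single step, or it
can be estimated in two steps by first estimating $\boldsymbol{\theta}$ and then using
the estimate $\hat{\boldsymbol{\theta}}$ for estimating $\boldsymbol{\phi}$. Since the
$j$th element of $\boldsymbol{\theta}$ is estimated using only
$\{\mathbf{x}_j(n)\}_{n=1}^{N}$, it is clear that each element of
$\boldsymbol{\theta}$ is estimated using incomplete data.

Define the column vector $\mathbf{x}(n)$ consisting of a concatenation of the
observation vectors,
$\mathbf{x}(n) \triangleq [\mathbf{x}_1^T(n), \mathbf{x}_2^T(n), \ldots, \mathbf{x}_L^T(n)]^T$,
and the matrix $\mathbf{X} \triangleq [\mathbf{x}(1), \mathbf{x}(2), \ldots, \mathbf{x}(N)]$.
Under the assumption that the vectors $\{\mathbf{x}(n)\}_{n=1}^{N}$ are independent,
the log likelihood of $\mathbf{X}$ is given by
$\ell(\mathbf{X} | \boldsymbol{\theta}) = \sum_{n=1}^{N} \ell(\mathbf{x}(n) | \boldsymbol{\theta})$,
where
$\ell(\mathbf{x}(n) | \boldsymbol{\theta}) = \sum_{j=1}^{L} \ell(\mathbf{x}_j(n) | \theta_j)$
and $\ell(\mathbf{x}_j(n) | \theta_j) = \ln(f(\mathbf{x}_j(n) | \theta_j))$.

Since $\boldsymbol{\theta} = \mathbf{g}(\boldsymbol{\phi})$, the exact ML estimator of
$\boldsymbol{\phi}$ is defined by

$$\hat{\boldsymbol{\phi}}_1 = \arg\max_{\boldsymbol{\phi} \in \Phi} \ell(\mathbf{X} | \mathbf{g}(\boldsymbol{\phi})) \tag{10.95}$$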


This is a single-step scheme for estimating $\boldsymbol{\phi}$ from all observations.
Note that the search is performed only within the known space $\Phi$ of the vector
$\boldsymbol{\phi}$.

Although Equation (10.95) is well known, intuitive, and conceptually simple,
it requires the collection of all observation vectors before
$\hat{\boldsymbol{\phi}}_1$ can be evaluated. This may be a costly process in terms of
power and transmission bandwidth and perhaps not a practical option in some cases.

In the two-step scheme (or indirect method) $\theta_j$ is estimated using the
incomplete data $\{\mathbf{x}_j(n)\}_{n=1}^{N}$. Invoking the ML principle, we can write

$$\hat{\theta}_j = \arg\max_{\theta_j \in \mathbb{R}} \sum_{n=1}^{N} \ell(\mathbf{x}_j(n) | \theta_j), \quad j = 1, \ldots, L \tag{10.96}$$
The $L$ estimates $\{\hat{\theta}_j\}_{j=1}^{L}$ are independent random variables.
Also, since the estimate $\hat{\theta}_j$ is obtained by the $j$th sensor separately
and without knowing what the other values of $\{\hat{\theta}_k\}_{k \ne j}$ are, there
is no guarantee that $\hat{\boldsymbol{\theta}} \in \bar{\Theta}$. Moreover, while
$\forall \boldsymbol{\theta} \in \bar{\Theta}$ we have $\boldsymbol{\phi} \in \Phi$ by
the injective property of $\mathbf{g}(\cdot)$, we do not necessarily have an inverse
mapping for any $\hat{\boldsymbol{\theta}}$ not in $\bar{\Theta}$. Thus, the ML
invariance theorem (Scharf, 1990, p. 218) is not applicable in this case.

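We now use $\hat{\boldsymbol{\theta}}$ for estimating $\boldsymbol{\phi}$. Recall that
$\hat{\boldsymbol{\theta}}$ is estimated using the ML principle, and it is well known
that under certain regularity conditions and $N \to \infty$,
$\hat{\boldsymbol{\theta}}$ is normally distributed with mean
$\boldsymbol{\theta}_0$ (the true value) and covariance $\mathbf{F}_0^{-1}$, where
$\mathbf{F}_0$ is the FIM (Cramér, 1946). In other words,
$\sqrt{N}(\hat{\boldsymbol{\theta}} - \boldsymbol{\theta}_0) \sim \mathcal{N}(\mathbf{0}, N\mathbf{F}_0^{-1})$.
Here the FIM is a diagonal matrix and the nonzero elements are given by

$$[\mathbf{F}(\boldsymbol{\theta})]_{j,j} = N\, E\left\{ \left[ \frac{\partial \ell(\mathbf{x}_j(n) | \theta_j)}{\partial \theta_j} \right]^2 \right\} \tag{10.97}$$

and $\mathbf{F}_0 = \mathbf{F}(\boldsymbol{\theta}_0)$.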

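Therefore, a reasonable estimate of the primary parameter vector
$\boldsymbol{\phi}$ can be obtained by minimizing a cost function of the form
$[\hat{\boldsymbol{\theta}} - \mathbf{g}(\boldsymbol{\phi})]^T \mathbf{W} [\hat{\boldsymbol{\theta}} - \mathbf{g}(\boldsymbol{\phi})]$,
where $\mathbf{W}$ is a diagonal weighting matrix. The element $W_{jj}$ should reflect
the relative confidence in estimating $\theta_j$, so the choice of
$W_{jj} = [\mathbf{F}_0]_{jj}$ is natural. Unfortunately, $\mathbf{F}_0$ may depend on
the true value $\boldsymbol{\theta}_0$, which is not known. Thus, we propose the use
of $\mathbf{W} = \mathbf{F}(\hat{\boldsymbol{\theta}})$, which is a random diagonal
matrix. The second step in estimating $\boldsymbol{\phi}$ is therefore given by

$$\hat{\boldsymbol{\phi}}_2 = \arg\min_{\boldsymbol{\phi} \in \Phi} C(\boldsymbol{\phi} | \hat{\boldsymbol{\theta}}) \tag{10.98}$$

$$C(\boldsymbol{\phi} | \hat{\boldsymbol{\theta}}) \triangleq [\hat{\boldsymbol{\theta}} - \mathbf{g}(\boldsymbol{\phi})]^T\, \mathbf{F}(\hat{\boldsymbol{\theta}})\, [\hat{\boldsymbol{\theta}} - \mathbf{g}(\boldsymbol{\phi})] \tag{10.99}$$

This concludes the description of the single-step (or direct) estimation
procedure and the proposed two-step (indirect) estimation procedure.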

10.4.2 Asymptotic Performance 


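We now evaluate the asymptotic covariance of both estimators. First note that
the single-step estimator in Equation (10.95) is the ML estimator that under
regularity conditions converges in probability to the true parameter value as
$N \to \infty$. Further, $\hat{\boldsymbol{\phi}}_1$ is asymptotically normal and
efficient. Thus (Cramér, 1946),

$$\sqrt{N}(\hat{\boldsymbol{\phi}}_1 - \boldsymbol{\phi}_0) \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}_1) \tag{10.100}$$

where $\boldsymbol{\phi}_0$ is the true parameter value and the covariance is given by

$$\boldsymbol{\Sigma}_1^{-1} = E\left\{ \left( \frac{\partial \ell(\mathbf{x}(n) | \mathbf{g}(\boldsymbol{\phi}))}{\partial \boldsymbol{\phi}} \right) \left( \frac{\partial \ell(\mathbf{x}(n) | \mathbf{g}(\boldsymbol{\phi}))}{\partial \boldsymbol{\phi}} \right)^T \right\} \tag{10.101}$$

Note that $\partial \ell(\mathbf{x}(n) | \mathbf{g}(\boldsymbol{\phi})) / \partial \boldsymbol{\phi}$
is a column vector evaluated at the true parameter value $\boldsymbol{\phi}_0$. Using
the chain rule, we have

$$\frac{\partial \ell(\mathbf{x}(n) | \mathbf{g}(\boldsymbol{\phi}))}{\partial \boldsymbol{\phi}} = \frac{\partial \ell(\mathbf{x}(n) | \boldsymbol{\theta})}{\partial \boldsymbol{\phi}} = \frac{\partial \boldsymbol{\theta}^T}{\partial \boldsymbol{\phi}} \frac{\partial \ell(\mathbf{x}(n) | \boldsymbol{\theta})}{\partial \boldsymbol{\theta}} = \mathbf{G}\, \frac{\partial \ell(\mathbf{x}(n) | \boldsymbol{\theta})}{\partial \boldsymbol{\theta}} \tag{10.102}$$

where $\mathbf{G}$ is an $M \times L$ matrix defined by

$$\mathbf{G} \triangleq \frac{\partial \boldsymbol{\theta}^T}{\partial \boldsymbol{\phi}} = \frac{\partial \mathbf{g}^T(\boldsymbol{\phi})}{\partial \boldsymbol{\phi}} \tag{10.103}$$

and evaluated at $\boldsymbol{\phi}_0$. By substituting Equation (10.102) in Equation
(10.101) we obtain

$$\boldsymbol{\Sigma}_1^{-1} = \mathbf{G}\, E\left\{ \left( \frac{\partial \ell(\mathbf{x}(n) | \boldsymbol{\theta})}{\partial \boldsymbol{\theta}} \right) \left( \frac{\partial \ell(\mathbf{x}(n) | \boldsymbol{\theta})}{\partial \boldsymbol{\theta}} \right)^T \right\} \mathbf{G}^T = (1/N)\, \mathbf{G} \mathbf{F}_0 \mathbf{G}^T \tag{10.104}$$

Thus, the asymptotic covariance of $\hat{\boldsymbol{\phi}}_1$ is given by
$(\mathbf{G} \mathbf{F}_0 \mathbf{G}^T)^{-1}$. This is also recognized as the CRLB on
the covariance of any unbiased estimator of $\boldsymbol{\phi}$.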

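We turn to the two-step estimator in Equation (10.98). Because the estimate
$\hat{\boldsymbol{\phi}}_2$ is the minimizer of
$C(\boldsymbol{\phi} | \hat{\boldsymbol{\theta}})$,
$(\partial C / \partial \boldsymbol{\phi})_{\hat{\boldsymbol{\phi}}_2, \hat{\boldsymbol{\theta}}} = \mathbf{0}$.
Also, the true parameter $\boldsymbol{\phi}_0$ is the minimizer of
$C(\boldsymbol{\phi} | \boldsymbol{\theta}_0)$ and therefore
$(\partial C / \partial \boldsymbol{\phi})_0 = \mathbf{0}$, where $(\cdot)_0$ denotes an
expression evaluated at $\boldsymbol{\phi} = \boldsymbol{\phi}_0$,
$\hat{\boldsymbol{\theta}} = \boldsymbol{\theta}_0$.

Consider the first-order Taylor series expansion,

$$\left( \frac{\partial C}{\partial \boldsymbol{\phi}} \right)_{\hat{\boldsymbol{\phi}}_2, \hat{\boldsymbol{\theta}}} = \left( \frac{\partial C}{\partial \boldsymbol{\phi}} \right)_0 + \left( \frac{\partial^2 C}{\partial \boldsymbol{\phi}^2} \right)_0 (\hat{\boldsymbol{\phi}}_2 - \boldsymbol{\phi}_0) + \left( \frac{\partial^2 C}{\partial \boldsymbol{\phi}\, \partial \hat{\boldsymbol{\theta}}} \right)_0 (\hat{\boldsymbol{\theta}} - \boldsymbol{\theta}_0) + O(1/N) \tag{10.105}$$

As indicated, the left expression is 0, as is the first expression on the right side.
Thus,

$$\left( \frac{\partial^2 C}{\partial \boldsymbol{\phi}^2} \right)_0 (\hat{\boldsymbol{\phi}}_2 - \boldsymbol{\phi}_0) = -\left( \frac{\partial^2 C}{\partial \boldsymbol{\phi}\, \partial \hat{\boldsymbol{\theta}}} \right)_0 (\hat{\boldsymbol{\theta}} - \boldsymbol{\theta}_0) + O(1/N) \tag{10.106}$$

Observe that

$$\left( \frac{\partial^2 C}{\partial \boldsymbol{\phi}^2} \right)_0 = 2\mathbf{G} \mathbf{F}_0 \mathbf{G}^T \tag{10.107}$$

$$\left( \frac{\partial^2 C}{\partial \boldsymbol{\phi}\, \partial \hat{\boldsymbol{\theta}}} \right)_0 = -2\mathbf{G} \mathbf{F}_0 \tag{10.108}$$

Substituting Equations (10.107) and (10.108) in Equation (10.106) yields

$$\mathbf{G} \mathbf{F}_0 \mathbf{G}^T (\hat{\boldsymbol{\phi}}_2 - \boldsymbol{\phi}_0) = \mathbf{G} \mathbf{F}_0 (\hat{\boldsymbol{\theta}} - \boldsymbol{\theta}_0) + O(1/N) \tag{10.109}$$

Thus, noting that
$E\{(\hat{\boldsymbol{\theta}} - \boldsymbol{\theta}_0)(\hat{\boldsymbol{\theta}} - \boldsymbol{\theta}_0)^T\} = \mathbf{F}_0^{-1}$,
for large enough $N$ we get

$$\boldsymbol{\Sigma}_2 = E\{(\hat{\boldsymbol{\phi}}_2 - \boldsymbol{\phi}_0)(\hat{\boldsymbol{\phi}}_2 - \boldsymbol{\phi}_0)^T\} = (\mathbf{G} \mathbf{F}_0 \mathbf{G}^T)^{-1} \mathbf{G} \mathbf{F}_0 \mathbf{F}_0^{-1} \mathbf{F}_0 \mathbf{G}^T (\mathbf{G} \mathbf{F}_0 \mathbf{G}^T)^{-1} = (\mathbf{G} \mathbf{F}_0 \mathbf{G}^T)^{-1} \tag{10.110}$$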


We see that, although a one-step estimator is expected to perform better than 
the two-step estimator, asymptotically (for large N ) they are both unbiased and 
achieve the CRLB. 
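
10.4.3 Illustrative Example

An example that illustrates the principles just discussed may be of interest.
Consider the observation model

$$x_j(n) = \sqrt{\theta_j} + e_j(n) = \sqrt{\alpha_j \phi} + e_j(n); \quad n = 1, \ldots, N \tag{10.111}$$

where $\{\alpha_j\}$ are real positive known scalars, $\phi > 0$ is the primary scalar
parameter, $\theta_j = \alpha_j \phi$, and $\{e_j(n)\}$ are i.i.d. Gaussian variables,
$e_j(n) \sim \mathcal{N}(0, \sigma^2)$. Notice that
$\boldsymbol{\theta} = \mathbf{g}(\phi) = [\alpha_1, \ldots, \alpha_L]^T \phi$.

The true ML estimate of $\phi$ is given by maximizing the LLF of the observations
in Equation (10.111). The single-step true ML estimate is thus

$$\hat{\phi}_1 = \arg\min_{\phi} \sum_{j=1}^{L} \sum_{n=1}^{N} \left[ x_j(n) - \sqrt{\alpha_j \phi} \right]^2 = \left[ \frac{\sum_{j=1}^{L} \sqrt{\alpha_j}\, \bar{x}_j}{\sum_{j=1}^{L} \alpha_j} \right]^2 \tag{10.112}$$

where $\bar{x}_j \triangleq \frac{1}{N} \sum_{n=1}^{N} x_j(n)$. The two-step estimator
first finds an estimate for $\theta_j$ by maximizing the LLF of $\{x_j(n)\}$:

$$\hat{\theta}_j = \arg\min_{\theta_j} \sum_{n=1}^{N} \left[ x_j(n) - \sqrt{\theta_j} \right]^2 = \bar{x}_j^2 \tag{10.113}$$

The diagonal elements of the FIM of $\boldsymbol{\theta}$ are given by

$$[\mathbf{F}_0]_{j,j} = \frac{N}{4\sigma^2 \theta_j} \tag{10.114}$$

Substituting $\hat{\theta}_j$ for $\theta_j$ in Equation (10.114) and then substituting
the result in Equation (10.98) yields

$$\hat{\phi}_2 = \arg\min_{\phi} \sum_{j=1}^{L} \frac{(\hat{\theta}_j - \alpha_j \phi)^2}{\hat{\theta}_j} = \frac{\sum_{j=1}^{L} \alpha_j}{\sum_{j=1}^{L} \alpha_j^2 / \hat{\theta}_j} \tag{10.115}$$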

It can be easily verified that in general $\hat{\phi}_1 \ne \hat{\phi}_2$. However, for
$L = 1$ the two estimators yield the same result.

To check the performance of the estimators we performed MC computer
simulations. The following parameter values were selected: $\phi_0 = 1$,
$\sigma^2 = 3$, $\alpha_j = j$, $j = 1, 2, \ldots, 10$. In Figures 10.16 and 10.17 we
plot the empirical bias and empirical variance of each estimator. Each point on the
plot is based on 1000 independent trials. As expected, for large $N$ both estimators
are unbiased and have the same variance. For smaller values of $N$ the direct
estimator is better.
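
A Monte Carlo experiment of this kind can be sketched in a few lines of Python.
The sketch below is an illustration under stated assumptions, not the simulation
used for the figures: it uses the closed-form minimizers that follow from the
model (the direct estimate is the square of a weighted average of the sample
means, and the two-step estimate weights the squared sample means by the
estimated Fisher information), it draws each sample mean directly from its exact
Gaussian distribution with variance $\sigma^2/N$ instead of averaging $N$ raw
samples, and the number of trials and the seed are arbitrary choices.

```python
import math
import random
import statistics


def simulate(N, trials=2000, phi0=1.0, sigma2=3.0, L=10, seed=0):
    """Empirical bias and variance of the direct and two-step estimators."""
    rng = random.Random(seed)
    alpha = [float(j) for j in range(1, L + 1)]     # alpha_j = j
    sum_alpha = sum(alpha)
    std_mean = math.sqrt(sigma2 / N)                # std of each sample mean
    direct, two_step = [], []
    for _ in range(trials):
        # Sample means xbar_j ~ N(sqrt(alpha_j * phi0), sigma2 / N).
        xbar = [rng.gauss(math.sqrt(a * phi0), std_mean) for a in alpha]
        # Direct ML: least squares in sqrt(phi), then squared
        # (valid when the weighted average is nonnegative).
        root = sum(math.sqrt(a) * x for a, x in zip(alpha, xbar)) / sum_alpha
        direct.append(root ** 2)
        # Two-step: theta_hat_j = xbar_j^2, then weighted least squares
        # with weights proportional to 1 / theta_hat_j.
        theta_hat = [x * x for x in xbar]
        two_step.append(sum_alpha / sum(a * a / t for a, t in zip(alpha, theta_hat)))
    return {
        "bias_direct": statistics.fmean(direct) - phi0,
        "bias_two_step": statistics.fmean(two_step) - phi0,
        "var_direct": statistics.pvariance(direct),
        "var_two_step": statistics.pvariance(two_step),
        # Scalar CRLB (G F0 G^T)^{-1} = 4 sigma^2 phi0 / (N * sum_j alpha_j).
        "crlb": 4.0 * sigma2 * phi0 / (N * sum_alpha),
    }


results = {N: simulate(N) for N in (10, 100, 1000)}
for N, r in results.items():
    print(N, {name: round(value, 6) for name, value in r.items()})
```

For the largest data size both empirical variances should be close to the bound,
while the differences between the estimators show up at small $N$.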

In conclusion, two-step estimation schemes in multisensor applications are 
inferior to single-step ML estimation. We proved here that by using proper 
weighting in the second step the two-step estimator becomes asymptotically 
unbiased and efficient. 
FIGURE 10.16 Bias of each of the two estimators versus data size.

FIGURE 10.17 Variance of each of the two estimators versus data size.


10.5 CONCLUSION

DPD is a promising approach that yields superior results in difficult conditions 
such as low SNR, model errors, and NLOS. It requires good synchronization 
between base stations, both in frequency and time, which can be easily achieved 
by exploiting the global positioning system (GPS). It also requires the transfer 
of raw data between stations and therefore larger communication bandwidth. 

Despite these drawbacks, the advantages of DPD, in precision, model error 
mitigation, and complexity, should make it the preferred method in many 
applications. 